Flink CDC

Flink CDC is a distributed data integration tool for real-time data and batch data, built on top of Apache Flink. It prioritizes efficient end-to-end data integration and offers enhanced functionalities such as full database synchronization, sharding table synchronization, schema evolution and data transformation.

Flink CDC framework design

API Layers

Flink CDC provides three API layers for different usage scenarios:

1. YAML API (Pipeline API)

The YAML API provides a declarative, zero-code approach to define data pipelines. Users describe the source, sink, routing, transformation, and schema evolution rules in a YAML file and submit it via the flink-cdc.sh CLI.

Please refer to the Quickstart Guide for detailed setup instructions.

source:
  type: mysql
  hostname: localhost
  port: 3306
  username: root
  password: 123456
  tables: app_db.\.*

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""

# Transform data on-the-fly
transform:
  - source-table: app_db.orders
    projection: id, order_id, UPPER(product_name) as product_name
    filter: id > 10 AND order_id > 100
    
# Route source tables to different sink tables
route:
  - source-table: app_db.orders
    sink-table: ods_db.ods_orders
  - source-table: app_db.shipments
    sink-table: ods_db.ods_shipments
  - source-table: app_db.\.*
    sink-table: ods_db.others

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
  schema.change.behavior: evolve  # Support schema evolution
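A pipeline definition like the one above is submitted with the CLI shipped in the Flink CDC distribution. As a sketch (assuming FLINK_HOME points at a running Flink cluster and the YAML above is saved as mysql-to-doris.yaml, a hypothetical file name):

```shell
# From the unpacked Flink CDC distribution directory:
# submit the pipeline described in the YAML file to the cluster
# referenced by the FLINK_HOME environment variable.
export FLINK_HOME=/path/to/flink
./bin/flink-cdc.sh mysql-to-doris.yaml
```

On success the CLI prints the Flink job ID of the submitted pipeline, which can then be monitored in the Flink Web UI.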

Pipeline connectors:

Doris | Elasticsearch | Fluss | Hudi | Iceberg | Kafka | MaxCompute | MySQL | OceanBase | Oracle | Paimon | PostgreSQL | StarRocks

See the connector overview for a full list and configurations.

2. SQL API (Table/SQL API)

The SQL API integrates with Flink SQL, allowing users to define CDC sources using SQL DDL statements. Deploy the SQL connector JAR to FLINK_HOME/lib/ and use it directly in Flink SQL Client:

CREATE TABLE mysql_binlog (
  id INT NOT NULL,
  name STRING,
  description STRING,
  weight DECIMAL(10,3),
  PRIMARY KEY(id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = 'flinkpw',
  'database-name' = 'inventory',
  'table-name' = 'products'
);

SELECT id, UPPER(name), description, weight FROM mysql_binlog;

Available SQL connectors (dependencies bundled):

MySQL | PostgreSQL | Oracle | SQL Server | MongoDB | OceanBase | TiDB | Db2 | Vitess

See the source connector overview for a full list and configurations.

3. DataStream API

The DataStream API provides programmatic access for building custom Flink streaming applications. Add the corresponding connector as a Maven dependency:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-mysql-cdc</artifactId>
  <version>${flink-cdc.version}</version>
</dependency>
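With the dependency in place, a minimal DataStream program builds a source and consumes change events. The following is a sketch, assuming Flink CDC 3.x package names (org.apache.flink.cdc.*) and a local MySQL instance with the inventory database from the SQL example; hostname, credentials, and table list are placeholders to adapt:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.cdc.connectors.mysql.source.MySqlSource;
import org.apache.flink.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MySqlCdcExample {
    public static void main(String[] args) throws Exception {
        // Build a MySQL CDC source that emits change events as JSON strings.
        MySqlSource<String> source = MySqlSource.<String>builder()
                .hostname("localhost")
                .port(3306)
                .databaseList("inventory")        // databases to capture
                .tableList("inventory.products")  // fully qualified table names
                .username("flinkuser")
                .password("flinkpw")
                .deserializer(new JsonDebeziumDeserializationSchema())
                .build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing is required for exactly-once delivery and lets the
        // source resume from binlog offsets after a failure.
        env.enableCheckpointing(3000);

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source")
           .print(); // replace with a real sink in production

        env.execute("Print MySQL change events");
    }
}
```

Running this prints one JSON change record per insert, update, or delete on the captured tables, after an initial snapshot of existing rows.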

Available source connectors:

MySQL | PostgreSQL | Oracle | SQL Server | MongoDB | OceanBase | TiDB | Db2 | Vitess

All artifacts use group ID org.apache.flink. See the DataStream API packaging guide for a complete pom.xml example.

Flink Version Compatibility

Flink CDC | Supported Flink Versions | Notes
3.6 | 1.20, 2.2 |
3.5 | 1.19, 1.20 |
3.4 | 1.19, 1.20 |
3.3 | 1.19, 1.20 |
3.2 | 1.17, 1.18, 1.19 |
3.1 | 1.16, 1.17, 1.18, 1.19 | Only Flink CDC 3.1.1 supports Flink 1.19
3.0 | 1.14, 1.15, 1.16, 1.17, 1.18 | Pipeline API requires Flink 1.17 and above
2.4 | 1.13, 1.14, 1.15, 1.16, 1.17 | Flink CDC 1.x and 2.x do not support the Pipeline API, same below
2.3 | 1.13, 1.14, 1.15, 1.16 |
2.2 | 1.13, 1.14 |
2.1 | 1.13 |
2.0 | 1.13 |
1.4 | 1.13 |
1.3 | 1.12 |
1.2 | 1.12 |
1.1 | 1.11 |
1.0 | 1.11 |

See the Pipeline connector overview and source connector overview for details.

Join the Community

There are many ways to participate in the Apache Flink CDC community. The mailing lists are the primary place where all Flink committers are present. For user support and questions, use the user mailing list. If you've found a problem with Flink CDC, please create a Flink JIRA ticket and tag it with the Flink CDC tag. Bugs and feature requests can be discussed on the dev mailing list or on JIRA.

Contributing

Contributions to Flink CDC are welcome; please see our Developer Guide and APIs Guide.

License

Apache 2.0 License.

Special Thanks

The Flink CDC community welcomes everyone who is willing to contribute, whether through submitting bug reports, enhancing the documentation, or contributing code for bug fixes, tests, or new features.
Thanks to all contributors for their enthusiastic contributions.