GoldenGate – Data Replication from Microsoft SQL Server to Big Data (Kafka)

#25955

    Summary
    Data replication from SQL Server to Big Data (Kafka) through the GoldenGate Big Data adapter. Oracle GoldenGate for Big Data (OGG-BD) connects with Apache Kafka to provide data replication in near real time. The intended solution deploys OGG and OGG-BD on the source database server and installs Apache Kafka, extracting and replicating data to Kafka topics in AVRO or JSON format.

    1. MS SQL Server, OGG Core, OGG-BD, and Apache Kafka are installed on the same machine.
    2. GoldenGate extracts the data and stores the trail files on the server.
    3. The OGG-BD replicat reads the trail files, connects to an Apache Kafka producer, converts the data into AVRO/JSON messages, and replicates it to Kafka topics.

    Software Requirements:

    Oracle GoldenGate 12.1.2.1.0
    Oracle GoldenGate for Big Data 12.3.2.1
    MS SQL Server 2014
    Apache Kafka (this walkthrough uses kafka_2.11-1.1.1)

    1. Enable supplemental logging for SQL Server tables.
    Log in to the SQL Server database through the OGG core GGSCI and enable supplemental logging for every table required for replication (see the GGSCI sketch after this list):
    ADD TRANDATA <<SchemaName.TableName>>
    2. Create a DSN so the extract can access the database.
    3. Create a new extract process.
    4. Create a new replicat process for data replication into Kafka.
    5. Regenerate the DEFGEN file for data replication and copy it to the OGG-BD/dirdef directory.
    6. Create the Kafka property file; it is referenced by the replicat parameter file.
    7. Prepare the following files to connect with Kafka:
    Kafka producer file / jaas.conf / keytab file / truststore file
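
    A minimal GGSCI sketch for step 1, assuming the DSN name GG_DBNAME and credential alias GG_USER from the extract parameter file below and a hypothetical table dbo.Orders (the DSN for step 2 is created beforehand on Windows with the ODBC Data Source Administrator, odbcad32.exe):

    GGSCI> DBLOGIN SOURCEDB GG_DBNAME, USERIDALIAS GG_USER
    GGSCI> ADD TRANDATA dbo.Orders
    GGSCI> INFO TRANDATA dbo.Orders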

    Extract Parameter file:

    EXTRACT NAME1
    -- Extract name must be eight characters or fewer
    -- Source GoldenGate user credentials
    SOURCEDB GG_DBNAME, USERIDALIAS GG_USER
    TRANLOGOPTIONS MANAGESECONDARYTRUNCATIONPOINT
    TRANLOGOPTIONS QUERYRETRYCOUNT 3
    -- Extract trail location (the path here is an assumed example)
    EXTTRAIL ./dirdat/et
    -- Capture inserts/updates/deletes/truncates
    GETINSERTS
    GETUPDATES
    GETDELETES
    GETTRUNCATES
    -- Table list
    Table <<SchemaName.TableName>>;
    Table <<SchemaName.TableName>>;
    Table <<SchemaName.TableName>>;
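
    Once this file is saved as dirprm/NAME1.prm, the extract can be registered and started from GGSCI roughly as follows (the trail prefix ./dirdat/et is the assumed example from the EXTTRAIL line above):

    GGSCI> ADD EXTRACT NAME1, TRANLOG, BEGIN NOW
    GGSCI> ADD EXTTRAIL ./dirdat/et, EXTRACT NAME1
    GGSCI> START EXTRACT NAME1
    GGSCI> INFO EXTRACT NAME1, DETAIL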

    Replicat Parameter file:

    REPLICAT NAME2
    TARGETDB LIBFILE ggjava.dll SET property=dirprm/kafka.props_op_avro_row_NAME2.txt
    -- ASSUMETARGETDEFS
    SOURCEDEFS dirdef/defgen_NAME1.def
    REPORTCOUNT EVERY 1 MINUTES, RATE
    -- GROUPTRANSOPS 1000
    GETINSERTS
    GETUPDATES
    GETDELETES
    GETTRUNCATES
    -- Mapping table list
    MAP <<SchemaName.TableName>>, TARGET <<SchemaName.TableName>>;
    MAP <<SchemaName.TableName>>, TARGET <<SchemaName.TableName>>;
    MAP <<SchemaName.TableName>>, TARGET <<SchemaName.TableName>>;
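
    On the OGG-BD side, with this file saved as dirprm/NAME2.prm, the replicat can be added against the same trail and started roughly as follows (a Big Data replicat reads the trail directly, so no checkpoint table is needed):

    GGSCI> ADD REPLICAT NAME2, EXTTRAIL ./dirdat/et
    GGSCI> START REPLICAT NAME2
    GGSCI> INFO REPLICAT NAME2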

    Defgen Parameter file:

    DEFSFILE ./dirdef/defgen_NAME1.def, PURGE
    SOURCEDB DBNAME, USERIDALIAS goldengateuser
    -- Tables for which the defgen utility generates definitions, used by the replicat's SOURCEDEFS
    Table <<SchemaName.TableName>>;
    Table <<SchemaName.TableName>>;
    Table <<SchemaName.TableName>>;

    Execute the following command and find the generated defgen_NAME1.def file in the c:\dirdef directory:
    C:\>defgen paramfile c:\dirprm\defgen_NAME1.prm
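
    Per step 5, the definitions file is then copied into the dirdef directory of the OGG-BD installation; assuming a hypothetical OGG-BD home of C:\OGG-BD:

    C:\>copy c:\dirdef\defgen_NAME1.def C:\OGG-BD\dirdef\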

    Kafka Property file:

    ###KAFKA Properties file ###
    gg.log=log4j
    gg.log.level=info
    gg.report.time=30sec
    ###Kafka Classpath settings ###
    ###gg.classpath=dirprm\\;C:\\kafka\\kafka_2.11-1.1.1\\libs\\* ###
    ###javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava\\ggjava.jar ###
    gg.classpath=dirprm\\;C:\\kafka\\kafka_2.11-1.1.1\\libs\\*
    javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava\\ggjava.jar -Djava.security.auth.login.config=C:\\hostname\\dircrd\\Prod\\jaas.conf -Djava.security.krb5.conf=C:\\hostname\\dircrd\\Prod\\krb5.conf
    ### Kafka handler properties ###
    gg.handlerlist=kafkahandler
    gg.handler.kafkahandler.type=kafka
    gg.handler.kafkahandler.KafkaProducerConfigFile=kafka_producer.properties
    gg.handler.kafkahandler.mode=op
    gg.handler.kafkahandler.format=json_row
    gg.handler.kafkahandler.topicMappingTemplate=<<TopicName>>
    gg.handler.kafkahandler.keyMappingTemplate=${currentTimestamp}
    gg.handler.kafkahandler.BlockingSend=false
    gg.handler.kafkahandler.includeTokens=true
    gg.handler.kafkahandler.format.insertOpKey=I
    gg.handler.kafkahandler.format.updateOpKey=U
    gg.handler.kafkahandler.format.deleteOpKey=D
    gg.handler.kafkahandler.format.truncateOpKey=T
    #gg.handler.kafkahandler.format.includePrimaryKeys=true
    gg.handler.kafkahandler.format.pkUpdateHandling=delete-insert
    ### native library config ###
    goldengate.userexit.timestamp=utc
    goldengate.userexit.writers=javawriter
    javawriter.stats.display=TRUE
    javawriter.stats.full=TRUE
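
    The javawriter.bootoptions above point the Kafka client at a jaas.conf for Kerberos. A minimal sketch of that file, assuming a hypothetical keytab name and principal (with Kerberos/SSL in play, the producer file below would also carry the matching security.protocol and ssl.truststore.* settings):

    KafkaClient {
        com.sun.security.auth.module.Krb5LoginModule required
        useKeyTab=true
        storeKey=true
        keyTab="C:\\hostname\\dircrd\\Prod\\kafka.keytab"
        principal="gguser@EXAMPLE.COM";
    };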

    Kafka Producer file:

    ## one or more brokers as host:port, comma-separated
    bootstrap.servers=<hostname or IP>:9092
    acks=all
    compression.type=gzip
    reconnect.backoff.ms=1000
    min.insync.replicas=2
    value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
    key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
    ##100KB per partition
    batch.size=102400
    max.request.size=5024000
    send.buffer.bytes=5024000
    linger.ms=1000
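
    Once the extract and replicat are running, delivery can be verified with the console consumer that ships with Kafka (the install path matches the gg.classpath above; replace <<TopicName>> with whatever topicMappingTemplate resolves to). With format=json_row each operation arrives as one JSON message with the column values at the top level; the record shown is illustrative for an insert on a hypothetical dbo.Orders table:

    C:\kafka\kafka_2.11-1.1.1>bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic <<TopicName>> --from-beginning
    {"table":"DBO.ORDERS","op_type":"I","op_ts":"2019-07-01 10:15:30.000000","current_ts":"2019-07-01T10:15:31.000000","pos":"00000000010000001234","ORDER_ID":1001,"CUSTOMER_NAME":"ACME","ORDER_TOTAL":250.00}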

    #32109

    Thank you for the good post.
