datahub performance #11671

Open
pilipyukaaa opened this issue Oct 18, 2024 · 0 comments
Labels
bug Bug report

Comments

pilipyukaaa commented Oct 18, 2024

Hello,
I have a performance problem with the process that consumes messages from Kafka and pushes changes to Elasticsearch and Neo4j.
I added these environment variables to my GMS (the log output follows the configuration):

  extraEnvs:
    - name: SPRING_KAFKA_PROPERTIES_MAX_POLL_RECORDS
      value: '10'
    - name: SPRING_KAFKA_PROPERTIES_MAX_POLL_INTERVAL_MS
      value: '120000'
    - name: ES_BULK_REQUESTS_LIMIT
      value: '1500'
    - name: ES_BULK_FLUSH_PERIOD
      value: '2'
    - name: LOGGING_LEVEL_ORG_APACHE_KAFKA_CLIENTS_CONSUMER
      value: DEBUG
    - name: LOGGING_LEVEL_ORG_SPRINGFRAMEWORK_KAFKA
      value: DEBUG
    - name: ELASTICSEARCH_THREAD_COUNT
      value: '15'
    - name: ES_BULK_ENABLE_BATCH_DELETE
      value: 'true'
2024-10-18 09:01:22,092 [I/O dispatcher 1] INFO  c.l.m.s.e.update.BulkListener:61 - Successfully fed bulk request 172. Number of events: 5 Took time ms: 3
2024-10-18 09:01:40,463 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook IncidentsSummaryHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,463 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook IngestionSchedulerHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,463 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook EntityChangeEventGeneratorHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,463 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook SiblingAssociationHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,463 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.s.SiblingAssociationHook:109 - Urn urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD) with aspect upstreamLineage received by Sibling Hook.
2024-10-18 09:01:40,467 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.s.SiblingAssociationHook:244 - Associating urn:li:dataset:(urn:li:dataPlatform:dbt,_bdm.bdm_dim_opportunity_view_final,PROD) and urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD) as siblings.
2024-10-18 09:01:40,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook FormAssignmentHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:137 - Successfully completed MCL hooks for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_opportunity_view_final,PROD)
2024-10-18 09:01:40,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:82 - Got MCL event key: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD), topic: MetadataChangeLog_Versioned_v1, partition: 0, offset: 119678, value size: 143224, timestamp: 1729168196437
2024-10-18 09:01:40,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:106 - Invoking MCL hooks for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD), aspect name: upstreamLineage, entity type: dataset, change type: UPSERT
2024-10-18 09:01:40,474 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook UpdateIndicesHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:01:40,479 [ThreadPoolTaskExecutor-1] INFO  c.l.m.s.e.update.ESBulkProcessor:82 - Added request id: EtcUX9vACyZAw/dPG+Inzw==, operation type: UPDATE, index: system_metadata_service_v1
2024-10-18 09:02:04,472 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook IncidentsSummaryHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:02:04,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook IngestionSchedulerHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:02:04,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook EntityChangeEventGeneratorHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:02:04,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook SiblingAssociationHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:02:04,473 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.s.SiblingAssociationHook:109 - Urn urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD) with aspect upstreamLineage received by Sibling Hook.
2024-10-18 09:02:04,476 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.s.SiblingAssociationHook:244 - Associating urn:li:dataset:(urn:li:dataPlatform:dbt,_bdm.bdm_dim_request,PROD) and urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD) as siblings.
2024-10-18 09:02:04,481 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:119 - Invoking MCL hook FormAssignmentHook for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)
2024-10-18 09:02:04,481 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.MetadataChangeLogProcessor:137 - Successfully completed MCL hooks for urn: urn:li:dataset:(urn:li:dataPlatform:clickhouse,_bdm.bdm_dim_request,PROD)

Performance is still very low, though. Can you help me find the bottleneck?
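
One way to narrow this down is to measure the pauses between consecutive log lines. The Python sketch below does that and also checks a poll batch against the consumer timeout. It is an illustration, not part of DataHub: the helper names `parse_timestamp`, `find_gaps`, and `batch_exceeds_poll_interval` are hypothetical, the sample lines are shortened copies of lines from the excerpt, and the figure of 24 seconds per record is a rough reading of that excerpt rather than a measured average.

```python
import re
from datetime import datetime

# Matches the timestamp format used in the log excerpt, e.g. "2024-10-18 09:01:40,479".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def parse_timestamp(line):
    """Return the first timestamp found in a log line, or None if there is none."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")


def find_gaps(lines, threshold_seconds=1.0):
    """Return (gap_seconds, line_before, line_after) for each pause above the threshold."""
    gaps = []
    previous = None  # (timestamp, line) of the last line that carried a timestamp
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue
        if previous is not None:
            gap = (ts - previous[0]).total_seconds()
            if gap > threshold_seconds:
                gaps.append((gap, previous[1], line))
        previous = (ts, line)
    return gaps


def batch_exceeds_poll_interval(seconds_per_record, max_poll_records, max_poll_interval_ms):
    """True if one full poll batch would take longer than max.poll.interval.ms."""
    return seconds_per_record * max_poll_records * 1000 > max_poll_interval_ms


if __name__ == "__main__":
    # Shortened copies of four lines from the excerpt (URNs and class names trimmed).
    sample = [
        "2024-10-18 09:01:40,474 ... Invoking MCL hook UpdateIndicesHook",
        "2024-10-18 09:01:40,479 ... ESBulkProcessor:82 - Added request id: ...",
        "2024-10-18 09:02:04,472 ... Invoking MCL hook IncidentsSummaryHook",
        "2024-10-18 09:02:04,481 ... Successfully completed MCL hooks",
    ]
    for gap, before, after in find_gaps(sample):
        print(f"{gap:.3f}s between:\n  {before}\n  {after}")
    # Values from the posted config: max.poll.records=10, max.poll.interval.ms=120000.
    print(batch_exceeds_poll_interval(24, 10, 120000))
```

In the excerpt, the longest pause (about 24 seconds) falls between the ESBulkProcessor "Added request" line and the next hook invocation, that is, while UpdateIndicesHook is running. The excerpt alone does not show whether that time goes to Elasticsearch or to the graph (Neo4j) update. If each record really takes that long, a batch of 10 records would need roughly 240 seconds, which is longer than the configured 120000 ms; when a Kafka consumer exceeds max.poll.interval.ms between polls, it is removed from the group and its partitions are rebalanced.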

pilipyukaaa added the bug Bug report label Oct 18, 2024