How to use Application Inspector


Application Inspector

1. Introduction

Application inspector provides an aggregated view of the resource data (cpu, memory, tps, datasource connection count, etc.) of all the agents registered under the same application name. A separate view is provided for the application inspector, with stat charts similar to those of the agent inspector.

To access application inspector, click on the application inspector menu on the left side of the screen.

  • 1 : application inspector menu, 2 : application stat data

The Heap Usage chart above, for example, shows the average (Avg), smallest (Min), and greatest (Max) heap usage of the agents registered under the same application name, along with the id of the agent that had the smallest/greatest heap usage at a certain point in time. The application inspector also provides the other statistics found in the agent inspector in a similar fashion.

Application inspector requires flink and zookeeper. Please read on for more detail.

2. Architecture

A. Run a streaming job on flink.
B. The taskmanager server is registered to zookeeper as a data node once the job starts.
C. The Collector obtains the flink server info from zookeeper to create a tcp connection with it and starts sending agent data.
D. The flink server aggregates data sent by the Collector and stores them into hbase.
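
To make step C more concrete, here is a minimal sketch of the kind of lookup the Collector performs, written against the plain zookeeper Java client. It is illustrative only: the znode path /pinpoint-cluster/flink, the node name format, and the class name are assumptions rather than the collector's actual implementation; the session timeout and tcp port simply mirror the values configured later in this document.

    import java.util.List;

    import org.apache.zookeeper.ZooKeeper;

    public class FlinkServerLookup {

        public static void main(String[] args) throws Exception {
            // Connect to the zookeeper ensemble that the flink taskmanager registered itself with.
            ZooKeeper zk = new ZooKeeper("YOUR_ZOOKEEPER_ADDRESS:2181", 3000, event -> { });
            try {
                // Assumed registration path; the path used by the real collector may differ.
                List<String> taskManagers = zk.getChildren("/pinpoint-cluster/flink", false);
                for (String node : taskManagers) {
                    // Each child node is expected to identify a reachable taskmanager,
                    // e.g. "10.0.0.1:19994", to which the collector opens a tcp connection.
                    System.out.println("flink taskmanager candidate: " + node);
                }
            } finally {
                zk.close();
            }
        }
    }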

3. Configuration

In order to enable application inspector, you will need to do the following and run Pinpoint.

A. Create the ApplicationStatAggre table (refer to the table creation script), which stores application stat data.

B. Configure zookeeper address in Pinpoint-flink.properties which will be used to store flink's taskmanager server information.

    flink.cluster.enable=true
    flink.cluster.zookeeper.address=YOUR_ZOOKEEPER_ADDRESS
    flink.cluster.zookeeper.sessiontimeout=3000
    flink.cluster.zookeeper.retry.interval=5000
    flink.cluster.tcp.port=19994

C. Configure job execution type and the number of listeners to receive data from the Collector in Pinpoint-flink.properties.

  • If you are running a flink cluster, set flink.StreamExecutionEnvironment to server.

  • If you are running flink as a standalone, set flink.StreamExecutionEnvironment to local.

      flink.StreamExecutionEnvironment=server
      flink.sourceFunction.Parallel=1

D. Configure hbase address in hbase.properties which will be used to store aggregated application data.

    hbase.client.host=YOUR_HBASE_ADDRESS
    hbase.client.port=2181

E. Build Pinpoint-flink and run the streaming job file created under the target directory on the flink server.

  • The name of the streaming job is pinpoint-flink-job-{pinpoint.version}.jar.

  • For details on how to run the job, please refer to the flink website.

  • You must pass spring.profiles.active release or spring.profiles.active local as a job parameter so that the job can refer to the configuration files set up above; the job uses the Spring profile feature internally to locate its configuration files, so this parameter is required (see the example below).
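
For example, submitting the job to a running flink cluster with the flink command-line client might look like the following; the flink installation path and the Pinpoint version are placeholders, and the profile is passed through to the job as a plain job parameter as described above.

    ./bin/flink run pinpoint-flink-job-{pinpoint.version}.jar spring.profiles.active release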

F. Configure zookeeper address in Pinpoint-Collector.properties so that the Collector can connect to the flink server.

    flink.cluster.enable=true
    flink.cluster.zookeeper.address=YOUR_ZOOKEEPER_ADDRESS
    flink.cluster.zookeeper.sessiontimeout=3000

G. Enable application inspector in the web-ui by enabling the following configuration in pinpoint-web.properties.

    config.show.applicationStat=true

4. Monitoring Streaming Jobs

There is a batch job that monitors how Pinpoint streaming jobs are running. To enable this batch job, configure the following files for Pinpoint-web.

batch.properties

    batch.flink.server=FLINK_MANAGER_SERVER_IP_LIST
    # Flink job manager server IPs, separated by ','.
    # ex) batch.flink.server=123.124.125.126,123.124.125.127

applicationContext-batch-schedule.xml

    <task:scheduled-tasks scheduler="scheduler">
        ...
        <task:scheduled ref="batchJobLauncher" method="flinkCheckJob" cron="0 0/10 * * * *" />
    </task:scheduled-tasks>

If you would like to send alarms in case of batch job failure, you must implement com.navercorp.pinpoint.web.batch.JobFailMessageSender class and register it as a Spring bean.
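
Registering such an implementation might look like the following, in the style of the Spring XML context files shown above; the package, class name, and bean id below are hypothetical and only illustrate wiring a custom sender into the web module.

    <!-- hypothetical custom implementation of JobFailMessageSender; register it as a Spring bean as described above -->
    <bean id="jobFailMessageSender" class="com.example.pinpoint.batch.AlarmJobFailMessageSender"/>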

5. Others

For more details on how to install and operate flink, please refer to the flink website.

Application Inspector

1. Introduction

The application inspector aggregates the resource data (stat: cpu, memory, tps, datasource connection count) of agents and displays the aggregated result. For reference, an application is made up of a group of agents, and the resource data of an individual agent can be viewed on the agent inspector screen. The application inspector likewise has its own separate screen.

Click the application inspector link in the menu on the left side of the inspector screen to view the data.

  • 1 : application inspector menu, 2: application stat data

For example, for an application A, the heap usage of the agents belonging to A is collected, and the chart shows the average heap usage, the agent id and usage of the agent with the highest heap usage, and the agent id and usage of the agent with the lowest heap usage. The other data provided by the agent inspector is aggregated and provided by the application inspector in the same way.

application inspector κΈ°λŠ₯을 λ™μž‘μ‹œν‚€κΈ° μœ„ν•΄μ„œλŠ” flink와 zookeeperκ°€ ν•„μš”ν•˜κ³ , κΈ°λŠ₯의 λ™μž‘ ꡬ쑰와 ꡬ성 및 μ„€μ • 방법을 μ•„λž˜ μ„€λͺ…ν•œλ‹€.

2. λ™μž‘ ꡬ쑰

The application inspector works as follows (see the architecture diagram).

A. Run a streaming job on flink.
B. When the job starts, the taskmanager server information is registered as a data node in zookeeper.
C. The Collector obtains the flink server information from zookeeper, establishes a tcp connection to the flink server, and sends agent stat data.
D. The flink server aggregates the agent data and stores the statistics in hbase.

3. Configuration

To enable the application inspector, change the configuration as shown below and run Pinpoint.

A. ν…Œμ΄λΈ” 생성 슀크립트λ₯Ό μ°Έμ‘°ν•˜μ—¬ application 톡계 데이터λ₯Ό μ €μž₯ν•˜λŠ” ApplicationStatAggre ν…Œμ΄λΈ”μ„ μƒμ„±ν•œλ‹€.

B. In the flink project configuration file (Pinpoint-flink.properties), set the zookeeper address used to store the taskmanager server information.

    flink.cluster.enable=true
    flink.cluster.zookeeper.address=YOUR_ZOOKEEPER_ADDRESS
    flink.cluster.zookeeper.sessiontimeout=3000
    flink.cluster.zookeeper.retry.interval=5000
    flink.cluster.tcp.port=19994

C. In the flink project configuration file (Pinpoint-flink.properties), configure how the job is executed and the number of listeners that receive data from the Collector.

  • If flink is deployed as a cluster, set flink.StreamExecutionEnvironment to server.

  • If flink is run in standalone mode, set flink.StreamExecutionEnvironment to local.

    flink.StreamExecutionEnvironment=server
    flink.sourceFunction.Parallel=1

D. In the flink project configuration file (hbase.properties), set the hbase address where the aggregated data is stored.

    hbase.client.host=YOUR_HBASE_ADDRESS
    hbase.client.port=2181

E. Build the flink project and run the streaming job file created under the target folder on the flink server.

  • The name of the streaming job file is pinpoint-flink-job-{pinpoint.version}.jar.

  • For how to run the job, refer to the flink website.

  • You must pass spring.profiles.active release or spring.profiles.active local as a job parameter so that the job can refer to the configuration files set up above. Because the job uses the Spring profile feature internally to locate its configuration files, this parameter is required.

F. Set the zookeeper address in the configuration file (Pinpoint-Collector.properties) so that the Collector can connect to the flink server.

    flink.cluster.enable=true
    flink.cluster.zookeeper.address=YOUR_ZOOKEEPER_ADDRESS
    flink.cluster.zookeeper.sessiontimeout=3000

G. webμ—μ„œ application inspector λ²„νŠΌμ„ ν™œμ„±ν™” ν•˜κΈ° μœ„ν•΄μ„œ μ„€μ •νŒŒμΌ(pinpoint-web.properties)을 μˆ˜μ •ν•œλ‹€.

    config.show.applicationStat=true

4. Monitoring Streaming Jobs

There is a batch job that checks whether the Pinpoint streaming job is running. To enable this batch job, modify the configuration files of the Pinpoint web project.

batch.properties

    batch.flink.server=FLINK_MANAGER_SERVER_IP_LIST
    # Set the flink job manager server IPs in the batch.flink.server property, separated by ','.
    # ex) batch.flink.server=123.124.125.126,123.124.125.127

applicationContext-batch-schedule.xml

    <task:scheduled-tasks scheduler="scheduler">
        ...
        <task:scheduled ref="batchJobLauncher" method="flinkCheckJob" cron="0 0/10 * * * *" />
    </task:scheduled-tasks>

If you would like an alarm to be sent when the batch job fails, implement the com.navercorp.pinpoint.web.batch.JobFailMessageSender class and register it as a Spring bean.

5. Others

μžμ„Έν•œ flink 운영 μ„€μΉ˜μ— λŒ€ν•œ λ‚΄μš©μ€ flink μ‚¬μ΄νŠΈλ₯Ό μ°Έκ³ ν•˜μž.
