...

For an explanation of how to define instance groups, see the Precise Installation Guide. The table below describes the information displayed in the Instance Grouping table.

Table 8-1 Instance Grouping table

Item | Description
Group | Displays the name of the group.
Instances | Displays the number of instances linked to the group.
Engine Utilization | Displays the average Engine Utilization for the instances in the specified group.
Committed Transactions | Displays the sum of Committed Transactions for the instances in the specified group.
Connections Opened (Sum) | Displays the sum of opened connections for the instances in the specified group.

About viewing instances associated with a Tier in the Association area

The Association area displays information on the instances associated with the selected Tier. The table below describes the information displayed in the tabs in the Association area.

Table 8-2 Viewing information on associated instances

Tab | Description
Overview | Displays counters reporting on CPU Usage, Memory Usage and amount of Locked Data.
Network | Displays counters reporting on the number of times an engine polls incoming and outgoing packets and bytes.
Disk I/O | Displays various counters reporting on the amount of disk I/O.
Transactions | Displays counters reporting on the number of rows changed in the instance, grouped by type (insert, update, and delete).

About the Instance entity

...

The Replication Inbound view combines several predefined graphs related to the inbound processing phase (replication agent). The view helps you identify what may be causing latency in the replication system. The information displayed at the instance level is a summation of all the replicated databases of the instance.

Table 8-3 IO vs. Cmds Sent

Graph | Description
IO vs. Cmds Sent

Highlights

The IO vs. Cmds Sent graph shows the number of commands processed by the RepAgent during the given period and the amount of time the RepAgent had to wait while writing to the inbound queue. Each bar represents a time period. The speed of the RepAgent is directly proportional to the speed at which ASE processes the network send requests, coupled with the speed of Replication Server processing.

What to do next

  • If the amount of I/O wait is significant (compared to the total time each bar represents), then the specific issue delaying the RepAgent must be isolated. The amount of time a RepAgent will have to wait while writing to the inbound queue depends on how much cache is available (exec_sqm_write_request_limit) as well as the values for init_sqm_write_delay/init_sqm_write_max_delay. For more information, check the online help available on the Sybase Web site.
  • If you see a shift in the trend of the Cmds Sent (fewer commands are sent even though the activity is the same), then something may have changed in the ASE processing. Examine the Activity tab to see what ASE was doing at that time and what, if anything, was holding it back.

Table 8-4 Truncation Point Movement

Graph | Description
Truncation Point Movement

Highlights

The Truncation Point Movement graph shows the number of times the RepAgent has asked the RS for a new secondary truncation point and then moved the secondary truncation point in the log. If 'Moved' is more than one less than 'Gotten', the likely cause is that a large or open transaction exists from the Replication Server perspective (either it is indeed still open in ASE or the RepAgent has not forwarded the commit record yet).

What to do next

  • Check for an open transaction in ASE.
  • To determine whether the numbers are high, divide the 'Replicated Log Records' value (shown in the Log Scan Summary graph) by the RepAgent 'scan_batch_size' configuration parameter, and add one to account for the additional move when the scan reaches the end of the log. The result should be close to the number shown.
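
The estimate described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical, and the use of integer division is an assumption, since the text says only to divide and add one.

```python
def expected_truncation_moves(replicated_log_records: int, scan_batch_size: int) -> int:
    """Estimate how many times the secondary truncation point should move.

    Follows the rule of thumb in the text: Replicated Log Records divided by
    scan_batch_size, plus one for the move at the end of the log.
    Integer division is an assumption made for this sketch.
    """
    if scan_batch_size <= 0:
        raise ValueError("scan_batch_size must be positive")
    return replicated_log_records // scan_batch_size + 1


def possible_open_transaction(gotten: int, moved: int) -> bool:
    """Return True when 'Moved' is more than one less than 'Gotten'."""
    return gotten - moved > 1


# Hypothetical sample values, not taken from a real system.
estimate = expected_truncation_moves(replicated_log_records=25_000, scan_batch_size=1_000)
print(estimate)
print(possible_open_transaction(gotten=10, moved=8))
```

Compare the estimate with the value shown in the graph; a large gap between 'Gotten' and 'Moved' is the cue to check for an open transaction in ASE.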

Table 8-5 Network

Graph | Description
Network

Highlights

The Network graph shows information related to network activity. If the Number of Full Packets is a considerable percentage of the total packets sent (>= 90%), you may need to consider increasing the send_buffer_size.

What to do next

  • Examine the send_buffer_size configuration parameter in the Objects tab.
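
As a rough illustration of the 90% rule above, the check could be expressed as follows. The function names and counter values are hypothetical; only the threshold comes from the text.

```python
def full_packet_ratio(full_packets: int, total_packets: int) -> float:
    """Fraction of sent packets that were full; 0.0 when nothing was sent."""
    if total_packets <= 0:
        return 0.0
    return full_packets / total_packets


def should_review_send_buffer_size(full_packets: int, total_packets: int,
                                   threshold: float = 0.90) -> bool:
    """True when full packets make up at least `threshold` of all packets sent."""
    return full_packet_ratio(full_packets, total_packets) >= threshold


# Hypothetical counter values read from the Network graph.
print(should_review_send_buffer_size(full_packets=950, total_packets=1000))
```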

Table 8-6 Log Scan Summary

Graph | Description
Log Scan Summary

Highlights

The Log Scan Summary graph is an indicator of how much work the RepAgent is doing and how much information is being sent to the Replication Server. Replicated log records are records converted into LTL and sent to the RS. Not replicated log records are all log records that were scanned but not sent.

What to do next

  • If the number of replicated log records is small compared to the total number of log records scanned, most of the transactions in the log are not intended for replication, so a large part of the log is scanned needlessly. Consider moving the few tables that need to be replicated to another database so that the total number of log records scanned is smaller.
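
One way to quantify "small compared to the total" is to compute the replicated share of all scanned log records. The sketch below is illustrative only: the function names are hypothetical, and the 10% cut-off is an assumption, because the text does not give a specific threshold.

```python
def replicated_share(replicated: int, not_replicated: int) -> float:
    """Share of scanned log records that were converted to LTL and sent."""
    total_scanned = replicated + not_replicated
    if total_scanned <= 0:
        return 0.0
    return replicated / total_scanned


def log_scanned_needlessly(replicated: int, not_replicated: int,
                           small_share: float = 0.10) -> bool:
    """True when the replicated share falls below an assumed cut-off."""
    return replicated_share(replicated, not_replicated) < small_share


# Hypothetical values from the Log Scan Summary graph.
print(replicated_share(replicated=2_000, not_replicated=98_000))
print(log_scanned_needlessly(replicated=2_000, not_replicated=98_000))
```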

Table 8-7 Replicated Commands Breakdown

Graph | Description
Replicated Commands Breakdown

Highlights

The Replicated Commands Breakdown graph lists the number and type of replicated commands that most affect replication.

  • DDL. Refers to DDL statements that were replicated. Generally this number should be zero, with only minor increases in a Warm Standby when DDL changes are made.
  • Writetext. Shows how many Writetext operations are being replicated.
  • DML with Text/Image. Displays how many row images are processed.

What to do next

  • If a large number of text rows are being replicated, you may want to investigate whether a text/image column was inappropriately marked or left at "always_replicate" instead of "replicate_if_changed".

About viewing Replication Outbound view graphs

The Replication Outbound view combines several predefined graphs related to the outbound processing phase. This enables you to understand which specific components may be causing latency in the replication system. The information displayed at the instance level is a summation of all the outbound processing performed for the specified instance by all the Replication Servers attached to it.

Table 8-8 SQM Processing

Graph | Description
SQM Processing

Highlights

The SQM Processing graph shows the space usage of the SQM queue over time versus the number of commands written to the queue. SQM Commands Written indicates the amount of activity performed by the replication server.

What to do next

  • Check the percentage used of the SQM and see if it is sufficient, over-allocated, or near maximum space capacity to accommodate the written commands.
  • Examine the trend of written commands, with special attention to trend changes. Is the activity evenly distributed, or does it peak at certain times? A deviation from normal trends can be acceptable, but it may also indicate a replication activity that is unauthorized or poorly defined.
  • If the number of commands is lower, there may be a problem in inbound queue processing or in the network. Check the replication inbound statistics.
  • If the amount of free space is lower and there is no change in the number of written commands, this may indicate that the subscriber was unable to apply the transaction. For example, an insert statement may have failed because of a duplicate key. If this occurs, the truncation point will indicate the failed transaction and will not proceed until it is cleared manually.

Table 8-9 DSI SQT Processing

Graph | Description
DSI SQT Processing

Highlights

The DSI SQT Processing graph shows space usage of the outbound queue cache.

The Cache size lists the size of the SQT cache. Cached Used shows SQT thread memory usage over time. Each command structure (allocated by an SQT thread) is freed when its transaction context is removed. Consequently, if no transactions are active in SQT, then SQT cache usage is zero.

If Cached Used is near the maximum cache size capacity and Trans Removed is constantly greater than zero, you may decide to increase the SQT cache size by increasing dsi_sqt_max_cache_size.

Do not increase the cache size if Trans Removed is only occasionally greater than zero; this situation usually means that a transaction was removed from the cache because it was larger than the dsi_sqt_max_cache_size. For occasional increases of Trans Removed, increasing the cache size will have the opposite effect because the latency in processing transactions ahead of them will likely result in their being removed.

SQM Read Cached indicates how many 16K blocks of cache were read by the SQM Reader thread. This number should be as high as possible. If it is constantly hitting zero, there is most likely latency in executing the SQL at the replicate, which causes the DSI SQT cache to reach its maximum.

What to do next

  • If Cached Used is near the maximum cache size capacity and Trans Removed is constantly greater than zero, you may decide to increase the SQT cache size by increasing dsi_sqt_max_cache_size.
  • For occasional increases of Trans Removed, increasing the cache size will have the opposite effect because the latency in processing transactions ahead of them will likely result in their being removed.
  • Examine dsi_sqt_max_cache_size and sqt_max_cache_size. The sqt_max_cache_size should usually be between 4 and 8 MB. For a non-parallel DSI configuration with no Warm StandBy, the dsi_sqt_max_cache_size should be smaller than the sqt_max_cache_size. If set too high (or left at zero as default), the DSI-S thread will continuously try to fill the available DSI SQT cache from the outbound queue, often at the expense of yielding the CPU to the DSI EXEC. Consequently, the optimum size for dsi_sqt_max_cache_size is between 1 and 2 MB.
  • For Warm StandBy Configurations, the DSI SQT cache should normally be equal to the SQT cache or higher because it is the DSI SQT thread that is actually sorting the transactions into commit order.
  • For parallel DSI configurations, the DSI SQT cache should contain double the dsi_max_xacts_in_group transactions for each DSI EXEC thread. The DSI EXEC thread can effectively process many row modifications because the load can be distributed among several available DSIs. This can result in a situation where the DSI transaction rate is higher than the amount of rows read from the outbound queue. In these situations, raising the DSI SQT cache allows the DSI to “read ahead” into the queue and begin preparing transactions before they are needed.
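
The Cached Used and Trans Removed guidance above can be summarized as a small decision helper. This is a sketch under stated assumptions: the function name and return strings are hypothetical, "near the maximum" is assumed to mean 90% or more of the cache size, and "constantly" is assumed to mean every sample in the observed period is greater than zero.

```python
from typing import Sequence


def sqt_cache_advice(cache_used: float, cache_size: float,
                     trans_removed_samples: Sequence[int],
                     near_full: float = 0.90) -> str:
    """Map Cached Used and Trans Removed readings to the guidance in the text."""
    # Assumption: "near maximum" means at least `near_full` of the cache size.
    near_capacity = cache_size > 0 and cache_used / cache_size >= near_full
    # Assumption: "constantly" means every sample is greater than zero.
    constantly = bool(trans_removed_samples) and all(s > 0 for s in trans_removed_samples)
    occasionally = any(s > 0 for s in trans_removed_samples) and not constantly

    if near_capacity and constantly:
        return "consider increasing dsi_sqt_max_cache_size"
    if occasionally:
        return "do not increase the cache size"
    return "no change indicated"


# Hypothetical readings: cache sizes in MB, Trans Removed per sample interval.
print(sqt_cache_advice(cache_used=0.95, cache_size=1.0, trans_removed_samples=[3, 2, 5]))
print(sqt_cache_advice(cache_used=0.95, cache_size=1.0, trans_removed_samples=[0, 4, 0]))
```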

Table 8-10 DSIE Transactions Processing

Graph | Description
DSIE Transactions Processing

Highlights

The DSIE Transactions Processing graph shows how long a DSI/E thread took to process a transaction, broken down into the following phases.

  • FS Map Time. The amount of time used to translate the replicated row functions into SQL commands. If a large amount of time is spent in this area, this may indicate that there are relatively large customized function strings; if so, there may be little you can do. However, you may want to check that the STS cache is sized appropriately.
  • Sent Time. This represents the amount of time spent sending the command batch to the replicate data server. A long lag time may indicate inefficient batching, or a slow response to client applications from the replicate server.
  • SendRPCTimeAvg. This records the time spent sending RPCs to the RDB.
  • SendDTTimeAvg. This records time spent sending LOB data to the RDB.
  • Result Time. This calculated value can be used to determine the amount of time spent processing results from the replicate server. This value also includes the execution time, because RS does very little result processing. Often, these metrics will be among the highest. You need to speed up the replicate DBMS to improve the RS throughput.
  • Commit Seq. Time. The amount of time spent waiting to commit. A high value may indicate a near-serial dsi_serialization_method, such as wait_for_commit. Alternatively, it may indicate contention within the replicate server, possibly within the rs_threads_group.
  • Batch Seq. Time. This is the time spent trying to coordinate the sending of the first batch in parallel DSIs. A long time may indicate that the dsi_serialization_method is wait_for_commit and a previous transaction is running a long time, or that the DSI thread is simply too busy to respond to the Batch Sequencing message.
  • Other Time. The execution time by the replicate database outside of the replication server.

What to do next

  • If Other Time is the largest share of the total transaction time, tuning RS will not help. You must do one of the following: tune the replicate database by using parallel DSIs, or use minimal columns/repdefs to reduce the SQL execution time.
  • If FS Map Time is high, consider rewriting the queries and simplifying them, or work with a stored procedure.
  • If Commit Seq. Time is high, consider lowering the number of parallel threads.
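
The following sketch shows one way to apply the recommendations above to a set of phase timings. The names and sample values are hypothetical, and treating the single largest phase as the "high" one is a simplifying assumption; the text itself does not define when a phase counts as high.

```python
# Advice paraphrased from the "What to do next" list above.
ADVICE = {
    "Other Time": "Tuning RS will not help; tune the replicate database "
                  "(parallel DSIs) or use minimal columns/repdefs.",
    "FS Map Time": "Consider rewriting and simplifying the queries, "
                   "or work with a stored procedure.",
    "Commit Seq. Time": "Consider lowering the number of parallel threads.",
}


def dominant_phase(phase_times: dict) -> str:
    """Return the phase with the largest time value."""
    return max(phase_times, key=phase_times.get)


def advice_for(phase_times: dict) -> str:
    """Look up the advice for the dominant phase, if this section gives any."""
    return ADVICE.get(dominant_phase(phase_times),
                      "No specific recommendation in this section.")


# Hypothetical per-transaction averages in milliseconds.
sample = {"FS Map Time": 2.0, "Sent Time": 5.0, "Result Time": 8.0,
          "Commit Seq. Time": 30.0, "Batch Seq. Time": 1.0, "Other Time": 12.0}
print(dominant_phase(sample))
print(advice_for(sample))
```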

Table 8-11 DSIE Commands Applied

Graph | Description
DSIE Commands Applied

Highlights

The DSIE Commands Applied graph shows how many commands were successfully applied to the target database by DSI/E over time (relevant for version 15 and higher). It can be used to measure the amount of work imposed by the replication server. You can also compare that number to the SQM Commands Written to check whether there is a lag in processing commands inside the replication server.

What to do next

  • Study the trend of applied commands. If it has suddenly dropped, this may indicate a problem in the subscriber ASE.
  • Go to the Activity tab and examine the resulting batches from replication (rep server program).

About the Engine entity

...

The table below describes what the utilization status of an engine indicates.

Table 8-12 Utilization status

Engine Busy | CPU Yields | Status
Low | Low | Engine is CPU starved
Low | High | Engine is inactive
High | Low | Engine is busy
High | High | Engine is busy
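
The utilization table can be read as a simple lookup from the two indicators to a status. The sketch below encodes the rows exactly as listed; the dictionary and function names are hypothetical.

```python
# Rows of the utilization status table: (Engine Busy, CPU Yields) -> Status.
UTILIZATION_STATUS = {
    ("Low", "Low"): "Engine is CPU starved",
    ("Low", "High"): "Engine is inactive",
    ("High", "Low"): "Engine is busy",
    ("High", "High"): "Engine is busy",
}


def utilization_status(engine_busy: str, cpu_yields: str) -> str:
    """Return the status for a given Engine Busy / CPU Yields combination."""
    return UTILIZATION_STATUS[(engine_busy, cpu_yields)]


print(utilization_status("Low", "High"))
```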

To improve kernel utilization, check the Disk I/O and Network views to determine whether the number of I/O or network checks that the engine performs is ideal or adds overhead.

...

  1. Select the associated required performance group from the Associations list.
  2. Select the required counter.

 
