Google Summer of Code notes
Performance Co-Pilot timeseries are series of time-stamped values gathered from hosts that make performance
data available. With pmseries queries, users can obtain information about performance
metrics in real time or from historical data. The existing pmseries query language implementation is
maturing, but some important query grammar and functions are still missing. This
proposal aims to extend the time series functions in the libpcp_web module: extending the
grammar to support scalar operands in expressions, extending the cross-domain operations, implementing
new statistical functions, and implementing new functions for time series sample matching.
Pull Requests
- Time domain operation for max PR #1611
- libpcp_web: add pmseries time domain functions PR #1623
- libpcp_web: solve memory leak PR #1628
- libpcp_web: modify libpcp_web make file PR #1630
- libpcp_web: Pmseries nth percentile PR #1637
- libpcp_web: pmseries language extensions with topk functions PR #1638
- pmseries: scalar multiplication PR #1681
 
Multi-host monitoring setup
Follow "Record metrics from a remote system" to set up the remote system (it can be a virtual machine on the same computer).
On your local host:
- Use `sudo vi /etc/hosts` to edit the hosts file on your machine and add the new system mapping.
- Go to `/etc/pcp/pmlogger`.
    - The `control` file contains one line per host to be logged.
    - The `control.d` directory stores the config of each host.
    - Copy the local file and give it a name for your remote system. Set the primary option to `n`, which means that the remote system is not primary; your local machine should be the primary system.
- The arguments for the hosts are:
    - `-r`: creates the local config
    - `-T`: terminating cycle
    - `-c`: config file for pmlogger
    - `-v`: volume size. Once the archive reaches the set volume size, a new archive is created.
- Use `sudo systemctl restart pmlogger` to restart pmlogger, and use `sudo systemctl status pmlogger` to check its status.
- Go to the directory that stores the files for your second system, e.g. `cd /var/log/pcp/pmlogger/shiyao_fedora`, to see the details.
    - Files `20220726.15.27.*` hold all the metrics sent from redis. Once a file reaches the set volume size (the `-v` option), a new one with the same prefix is created, new data is stored in the new file, and the old files are compressed.
    - File `20220726.15.27.index` is a lookup table for the files above. It is used to speed up data queries.
    - File `20220726.15.27.meta` holds the metadata from redis: metric names, descriptors, labels and so on.
    - File `Latest` is a PCP archive folio.
    - File `pmlogger.log` is created by the `-r` option. It stores the query frequency for each metric.
 
- Use `sudo systemctl stop pmproxy` to stop new messages from redis, and use `sudo systemctl restart pmproxy` to restart pmproxy and allow new messages from the redis server.
- Use `pmseries -a 805f4cdf368337dd564c365909543cc86a39275e` to see where the data comes from (local or remote).
Early June (Week 1 & Week 2)
- Set up testing environment
    - Use the following commands to check which .so is linked:
        - `which pmseries`
        - `ls -l /usr/local/lib`
        - `ls -l /usr/lib/libpcp*`
        - `ldd /usr/local/bin/pmseries`
    - To link to the correct .so.1 file, use:
        - `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/libpcp_web.so.1`
        - `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/libpcp_web.so.1`
- Finished time domain operations for the `min()` and `max()` functions
Notes
- Use `sudo ./check -g pmseries` to run all pmseries tests, or use `sudo ./check 1886` to run a specific test.
- If a function is implemented, remember to
    - run tests,
    - update man pages, and
    - run `valgrind --leak-check=full`
- To update man pages, edit the `/pcp/man/man1/pmseries.1` file and run `man ./pmseries.1` once the new function descriptions are added.
- Use `gdb --args pmseries "..."` to debug.
Late June (Week 3 & Week 4)
- Implemented time domain operations: `sum_sample()` and `avg_sample()`
- Implemented time domain and instance domain operations for standard deviation, i.e. `stdev_inst()` and `stdev_sample()`.
Notes
- Remember to update `np->value_set.series_values[i].series_desc.type`, `np->value_set.series_values[i].series_desc.semantics` and `np->value_set.series_values[i].series_desc.units` if any operation is done to the original redis data. Check out `pmSemStr` and `pmUnitsStr` for more information.
- Try to test in a multi-host environment: Record metrics from a remote system
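To illustrate the difference between the time domain (`_sample`) and instance domain (`_inst`) variants listed above, here is a minimal Python sketch. It is not the libpcp_web C implementation. The data layout (one dict per sample, keyed by a hypothetical instance name) and the helper names `per_instance` and `per_sample` are assumptions made for illustration, and the sketch uses the population standard deviation because these notes do not say which formula pmseries uses.

```python
import statistics

# Hypothetical layout: one dict per sample, mapping instance name to value.
samples = [
    {"cpu0": 2.0, "cpu1": 4.0},
    {"cpu0": 4.0, "cpu1": 4.0},
    {"cpu0": 6.0, "cpu1": 4.0},
]

def per_instance(samples, reducer):
    """Time domain: reduce each instance's values across all samples."""
    return {name: reducer([s[name] for s in samples]) for name in samples[0]}

def per_sample(samples, reducer):
    """Instance domain: reduce each sample's values across all instances."""
    return [reducer(list(s.values())) for s in samples]

def sum_sample(samples):
    return per_instance(samples, sum)

def avg_sample(samples):
    return per_instance(samples, statistics.mean)

def stdev_sample(samples):
    # Population standard deviation (an assumption, see the text above).
    return per_instance(samples, statistics.pstdev)

def stdev_inst(samples):
    return per_sample(samples, statistics.pstdev)

print(sum_sample(samples))   # {'cpu0': 12.0, 'cpu1': 12.0}
print(avg_sample(samples))   # {'cpu0': 4.0, 'cpu1': 4.0}
print(stdev_inst(samples))   # [1.0, 0.0, 1.0]
print(stdev_sample(samples))
```

The time domain functions return one value per instance, while the instance domain functions return one value per sample.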
 
 
Early July (Week 5 & 6)
- Implemented operations for `topk_inst()` and `topk_sample()`.
Notes
- Use `./new` to create a new QA test under the qa folder.
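The two top-k variants can be sketched in Python as follows. This is an illustration only: it assumes, by analogy with the other `_inst` and `_sample` functions, that `topk_inst()` ranks the instances within each sample and `topk_sample()` ranks the samples of each instance. The data layout and the instance names are hypothetical.

```python
import heapq

# Hypothetical layout: one dict per sample, mapping instance name to value.
samples = [
    {"sda": 5, "sdb": 9, "sdc": 1},
    {"sda": 7, "sdb": 2, "sdc": 8},
]

def topk_inst(samples, k):
    """For each sample, keep the k largest values among its instances."""
    return [heapq.nlargest(k, s.values()) for s in samples]

def topk_sample(samples, k):
    """For each instance, keep the k largest values across all samples."""
    return {
        name: heapq.nlargest(k, [s[name] for s in samples])
        for name in samples[0]
    }

print(topk_inst(samples, 2))    # [[9, 5], [8, 7]]
print(topk_sample(samples, 1))  # {'sda': [7], 'sdb': [9], 'sdc': [8]}
```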
Late July (Week 7 & 8)
- Implemented `nth_percentile_inst()` and `nth_percentile_sample()`.
Notes
- HdrHistogram_c provides examples for histogram functions. HDR stands for high dynamic range.
- bpftrace provides examples of histogram output.
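As a reference point for what an nth percentile computes, here is a small Python sketch using the nearest-rank definition. This is one of several common percentile definitions and is an assumption here; the notes do not state which one the pmseries functions use. The helper name `nth_percentile` is hypothetical.

```python
import math

def nth_percentile(values, n):
    """Nearest-rank percentile: the smallest value such that at least
    n percent of the values are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= n <= 100:
        raise ValueError("n must be between 0 and 100")
    ordered = sorted(values)
    # Rank is 1-based; n == 0 maps to the smallest value.
    rank = max(1, math.ceil(n * len(ordered) / 100))
    return ordered[rank - 1]

data = [15, 20, 35, 40, 50]
print(nth_percentile(data, 30))   # 20
print(nth_percentile(data, 50))   # 35
print(nth_percentile(data, 100))  # 50
```

Applying this helper to the instances of each sample, or to the samples of each instance, would give the instance domain and time domain variants respectively.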
 
 
Early August (Week 9)
- Implemented scalar multiplication and its tests
    - Note: overflow is not handled beyond detection. The current approach is to report an error when overflow occurs.
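The "report an error on overflow" approach can be sketched in Python as below. This is a simplified illustration, not the libpcp_web code: it assumes signed 64-bit integer values only (PCP metrics also come in other types), and the function name `scalar_multiply` is hypothetical. Python integers do not overflow, so the sketch checks the range explicitly.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

def scalar_multiply(values, scalar):
    """Multiply every value by scalar and raise OverflowError as soon as
    a product no longer fits in a signed 64-bit integer."""
    result = []
    for value in values:
        product = value * scalar
        if not INT64_MIN <= product <= INT64_MAX:
            raise OverflowError(
                f"{value} * {scalar} does not fit in a signed 64-bit integer"
            )
        result.append(product)
    return result

print(scalar_multiply([1, 2, 3], 4))  # [4, 8, 12]

try:
    scalar_multiply([2**62], 4)
except OverflowError as err:
    print("error:", err)
```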
 
 
Early September (Week 10 & 11)
- Understood the callback for creating a histogram bar chart.
    - `pcp/src/include/pcp/pmwebapi.h`: added a structure to store histogram values.
    - `pcp/src/pmseries/pmseries.c`: added a callback for `on_histogram_value`.
Notes
- Try to use HdrHistogram; it could become one of the vendored libraries in pcp.
 
 
Late September (12 & 13)
Special notes
What if qa fails
- Before running `./check ...`, run `pmseries --load "{source.path: \"PATH/pcp/qa/archives/proc\"}"`.
- If you are unable to connect to the redis server and the error message is 'Segmentation fault (core dumped)', try running `sudo make clean` and rebuilding the project. This should solve the segmentation fault problem.
Early October (14 & 15)
- Implemented the `histogram()` function: created a new callback structure to send histogram data.
- Understood the timeseries sample matching problem, and created graphs to see metric data trends.
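For readers unfamiliar with what a histogram function has to produce, here is a minimal Python sketch that counts values into equal-width buckets and renders a text bar chart. It is a generic illustration, not the pmseries `histogram()` implementation and not HdrHistogram; the bucket layout, the function names and the output format are all assumptions.

```python
def histogram(values, bucket_count):
    """Count values into bucket_count equal-width buckets spanning
    [min(values), max(values)]. Returns (low, high, count) tuples."""
    if not values or bucket_count < 1:
        raise ValueError("need at least one value and one bucket")
    low, high = min(values), max(values)
    # Fall back to a width of 1 when all values are equal.
    width = (high - low) / bucket_count or 1
    counts = [0] * bucket_count
    for value in values:
        # The maximum value is placed in the last bucket.
        index = min(int((value - low) / width), bucket_count - 1)
        counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(bucket_count)
    ]

def render(buckets):
    """Draw one line per bucket with one '#' per counted value."""
    return "\n".join(
        f"{low:8g} .. {high:<8g} {'#' * count}" for low, high, count in buckets
    )

buckets = histogram([1, 2, 2, 3, 9], 4)
print([count for _, _, count in buckets])  # [3, 1, 0, 1]
print(render(buckets))
```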
 
Special notes
Change the logging period of a metric:
- Go to `/var/lib/pcp/config/pmlogger/config.default`
- Add a section for the metric with the new period, such as `log advisory on 2sec { disk.all.read }`. By doing so, the reporting period of `disk.all.read` is changed to 2 seconds instead of 10 seconds.
Late October (16 & 17)
- Designed an algorithm that upsamples and interpolates a vector operand so that its timeseries samples match those of the other vector operand.
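The notes do not describe the algorithm itself, so the following Python sketch only shows the general idea of upsampling by linear interpolation: a series sampled at a long interval is resampled at the timestamps of a series sampled at a short interval, so that the two can be combined element by element. The function name, the data layout and the choice of linear interpolation are assumptions.

```python
from bisect import bisect_right

def interpolate(series, timestamps):
    """Resample series at the given timestamps by linear interpolation.

    series is a list of (timestamp, value) pairs with strictly increasing
    timestamps. Timestamps outside the series' range yield None, since
    this sketch does not extrapolate."""
    times = [t for t, _ in series]
    result = []
    for t in timestamps:
        if t < times[0] or t > times[-1]:
            result.append(None)
            continue
        j = bisect_right(times, t)
        if j == len(times):
            # t is exactly the last timestamp.
            result.append(series[-1][1])
            continue
        (t0, v0), (t1, v1) = series[j - 1], series[j]
        result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return result

# A metric logged every 10 seconds, matched to one logged every 2 seconds.
slow = [(0, 10.0), (10, 20.0)]
fast_times = [0, 2, 4, 6, 8, 10]
print(interpolate(slow, fast_times))  # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
```

Once both operands share the same timestamps, a binary operation can be applied pairwise to their values.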
 
Special notes
- Figured out the usage of the two `timing_t` fields in the `node_t` and `series_t` structures. The `timing_t` in `node_t` is the time period (`delta`/`interval` in the pcp time series query expression) for the series root node. The `timing_t` in `series_t` is the time interval for each child root.
 
TODO: After GSOC
- Keep implementing the Timeseries Sample Matching Function.
 - Solve some memory leak problems.