<div class="notebook"> <div class="nb-cell markdown" name="md1"> # Display SWISH server statistics ```eval :- if(current_predicate(swish_node/1)). :- swish_node(Node), aggregate_all(count, swish_cluster_member(_,_), Count), format('> This SWISH instance is part of a _cluster_ with ~d members. \c You are now connected to the SWISH backend: __~w__', [Count, Node]). :- endif. ``` This page examines the performance and health of the SWISH server. Most of the statistics are gathered by `lib/swish_diagnostics`, which is by default loaded into https://swish.swi-prolog.org but must be explicitly loaded into your own SWISH server. Part of the statistics are based on reading the Linux =|/proc|= file system and thus only function on Linux. The first step is easy, showing the overall statistics of this server. </div> <div class="nb-cell query" name="q3"> statistics. </div> <div class="nb-cell markdown" name="md2"> ## Historical performance statistics The charts below render historical performance characteristics of the server. Please open the program below for a description of chart/3. </div> <div class="nb-cell program" data-singleline="true" name="p1"> :- use_rendering(c3). %% chart(+Period, +Keys, -Chart) is det. % % Compute a Chart for the given period combining graphs for the given Keys. % Defined values for Period are: % - `minute`: the last 60 1 second measurements % - `hour`: the last 60 1 minute averages % - `day`: the last 24 1 hour averages % - `week`: the last 7 1 day averages % - `year`: the last 52 1 week averages % Defines keys are: % - `cpu`: Total process CPU time in seconds % - `d_cpu`: Differential CPU time (% CPU) % - `pengines`: Total number of living Pengines % - `d_pengines_created`: Pengines create rate (per second) % - `rss`: Resident set size in bytes % - `stack`: Total amount of memory allocated for Prolog stacks in bytes % - `heap`: `rss - stack`. This is a rough estimate of the memory used % for the program, which should stay bounded if the server is free of % leaks. Note that it can still grow significantly and can be temporarily % high if user applications use the dynamic database. % - `rss_mb`, `stack_mb`, `heap_mb` are the above divided by 1024^2. chart(PeriodS, Keys, Chart) :- atom_string(Period, PeriodS), swish_stats(Period, Dicts0), maplist(add_heap_mb, Dicts0, Dicts1), maplist(rss_mb, Dicts1, Dicts2), maplist(free_mb, Dicts2, Dicts3), maplist(stack_mb, Dicts3, Dicts4), maplist(fix_date, Dicts4, Dicts), dicts_slice([time|Keys], Dicts, LastFirstRows), reverse(LastFirstRows, Rows), period_format(Period, DateFormat), Chart = c3{data:_{x:time, xFormat:null, rows:Rows}, axis:_{x:_{type: timeseries, tick: _{format: DateFormat, rotate: 90, multiline: false}}}}. period_format(minute, '%M:%S'). period_format(hour, '%H:%M'). period_format(day, '%m-%d %H:%M'). period_format(week, '%Y-%m-%d %H:00'). period_format(year, '%Y-%m-%d'). add_heap_mb(Stat0, Stat) :- Heap is Stat0.get(heap) / (1024^2), !, put_dict(heap_mb, Stat0, Heap, Stat). add_heap_mb(Stat0, Stat) :- Heap is ( Stat0.get(rss) - Stat0.get(stack) - Stat0.get(fordblks) ) / (1024^2), !, put_dict(heap_mb, Stat0, Heap, Stat). add_heap_mb(Stat0, Stat) :- Heap is ( Stat0.get(rss) - Stat0.get(stack) ) / (1024^2), !, put_dict(heap_mb, Stat0, Heap, Stat). add_heap_mb(Stat, Stat). rss_mb(Stat0, Stat) :- Gb is Stat0.get(rss)/(1024**2), !, put_dict(rss_mb, Stat0, Gb, Stat). rss_mb(Stat, Stat). free_mb(Stat0, Stat) :- Gb is Stat0.get(fordblks)/(1024**2), !, put_dict(free_mb, Stat0, Gb, Stat). free_mb(Stat, Stat). stack_mb(Stat0, Stat) :- Gb is Stat0.get(stack)/(1024**2), !, put_dict(stack_mb, Stat0, Gb, Stat). stack_mb(Stat, Stat). fix_date(Stat0, Stat) :- Time is Stat0.time * 1000, put_dict(time, Stat0, Time, Stat). </div> <div class="nb-cell html" name="htm1"> <div class="panel panel-default"> <div class="panel-heading"> <label>Number of Pengines and CPU load over the past period</label> </div> <p>The number of Pegines denotes the number of actively executing queries. These queries may be sleeping while waiting for input, a debugger command or the user asking for more answers. Note that the number of Pengines is sampled and short-lived Pengines does not appear in this chart. </p><div class="panel-body"> <div class="form-group row" style="margin-bottom:0px"> <label class="col-sm-2">Period:</label> <div class="col-sm-10"> <label class="radio-inline"><input name="period1" value="week" type="radio">Week</label> <label class="radio-inline"><input name="period1" value="day" type="radio">Day</label> <label class="radio-inline"><input name="period1" value="hour" checked="" type="radio">Hour</label> <label class="radio-inline"><input name="period1" value="minute" type="radio">Minute</label> </div> </div> </div> </div> <script> notebook.bindQuery(function(q) { q.run({ Period: notebook.$('input[type=radio]:checked').val() }); }); </script> </div> <div class="nb-cell query" name="cpu"> projection([Chart]), chart(Period, [pengines,d_cpu], Chart). </div> <div class="nb-cell html" name="htm2"> <div class="panel panel-default"> <div class="panel-heading"> <label>Number of threads and local visitors</label> </div> <p>Threads are used as HTTP workers, pengines and some adminstrative tasks. Local visitors are the number of open websockets connected to this node of the cluster, which reflects the number of browser windows watching this SWISH instance. </p><div class="panel-body"> <div class="form-group row" style="margin-bottom:0px"> <label class="col-sm-2">Period:</label> <div class="col-sm-10"> <label class="radio-inline"><input name="period2" value="week" type="radio">Week</label> <label class="radio-inline"><input name="period2" value="day" type="radio">Day</label> <label class="radio-inline"><input name="period2" value="hour" checked="" type="radio">Hour</label> <label class="radio-inline"><input name="period2" value="minute" type="radio">Minute</label> </div> </div> </div> </div> <script> notebook.bindQuery(function(q) { q.run({ Period: notebook.$('input[type=radio]:checked').val() }); }); </script> </div> <div class="nb-cell query" name="visitors"> projection([Chart]), chart(Period, [pengines,threads,local_visitors], Chart). </div> <div class="nb-cell html" name="htm4"> <div class="panel panel-default"> <div class="panel-heading"> <label>Total cluster visitors</label> </div> <p>Cluster visitors aggregates the <em>local visitors</em> over all members of the cluster. </p><div class="panel-body"> <div class="form-group row" style="margin-bottom:0px"> <label class="col-sm-2">Period:</label> <div class="col-sm-10"> <label class="radio-inline"><input name="period2" value="week" type="radio">Week</label> <label class="radio-inline"><input name="period2" value="day" type="radio">Day</label> <label class="radio-inline"><input name="period2" value="hour" checked="" type="radio">Hour</label> <label class="radio-inline"><input name="period2" value="minute" type="radio">Minute</label> </div> </div> </div> </div> <script> notebook.bindQuery(function(q) { q.run({ Period: notebook.$('input[type=radio]:checked').val() }); }); </script> </div> <div class="nb-cell query" name="q4"> projection([Chart]), chart(Period, [visitors], Chart). </div> <div class="nb-cell html" name="htm3"> <div class="panel panel-default"> <div class="panel-heading"> <label>Memory usage over the past hour</label> </div> <p><b>rss</b> is the total (resident) memory usage as reported by Linux. <b>stack</b> is the memory occupied by all Prolog stacks. <b>heap</b> is an approximation of the memory used for the Prolog program space, computed as <i>rss - stack - free</i>. This is incorrect for two reasons. It ignores the C-stacks and the not-yet-committed memory of the Prolog stacks is not part of rss. free is memory that is freed but not yet reused as reported by GNU <a href="https://www.gnu.org/software/libc/manual/html_node/Statistics-of-Malloc.html">malinfo()</a> as <code>fordblks</code>. Note that <code>fordblks</code> is a 32-bit value. The implementation heuristically guesses how many times the value wrapped around and corrects for this. </p> <div class="panel-body"><i><i> <div class="form-group row" style="margin-bottom:0px"> <label class="col-sm-2">Period:</label> <div class="col-sm-10"> <label class="radio-inline"><input name="period3" value="week" type="radio">Week</label> <label class="radio-inline"><input name="period3" value="day" type="radio">Day</label> <label class="radio-inline"><input name="period3" value="hour" checked="" type="radio">Hour</label> <label class="radio-inline"><input name="period3" value="minute" type="radio">Minute</label> </div> </div> </i></i></div><i><i> </i></i></div><i><i> <script> notebook.bindQuery(function(q) { q.run({ Period: notebook.$('input[type=radio]:checked').val() }); }); </script></i></i> </div> <div class="nb-cell query" name="q6"> projection([Chart]), chart(Period, [rss_mb,heap_mb,stack_mb,free_mb], Chart). </div> <div class="nb-cell markdown" name="md6"> ## Health statistics The statistics below assesses the number of *Pengines* (actively executing queries from users) and the *highlight states*, the number of server-side mirrors we have from client's source code used to compute the semantically enriched tokens. If such states are not explicitly invalidated by the client, they are removed after having not been accessed for one hour. The *stale modules* count refers to temporary modules that are not associated to a Pengine, nor to a highlight state and probably indicate a leak. The two queries below extract information about stale modules and threads that have died. These are used to help debugging related leaks. </div> <div class="nb-cell program" data-singleline="true" name="p2"> :- use_rendering(table). stats([stale_modules-Stale|Pairs]) :- aggregate_all(count, pengine_stale_module(_), Stale), findall(Key-Value, (swish_statistics(Stat), Stat =.. [Key,Value]), Pairs). </div> <div class="nb-cell query" name="q7"> stats(Stats). </div> <div class="nb-cell query" data-chunk="10" data-tabled="true" name="q2"> pengine_stale_module(Module, State). </div> <div class="nb-cell query" data-chunk="10" data-tabled="true" name="q1"> swish_died_thread(Thread, State). </div> </div>