Commit e902a4d

Doc: Complete server monitoring tutorial
1 parent ed5f54c commit e902a4d

9 files changed

+181
-7
lines changed

linode/server_monitoring.md

+181-7
@@ -32,16 +32,19 @@ Log in to your Linode to see available Linodes:
3232

3333
Notice that I have 2 linodes on my account, _official_personal_website_ and _tinkereducationnewsletter_. I will show you how to configure Lassie using the _official_personal_website_ linode.
3434

35-
I will click on it to access more data on the linode. What I am interested in is the "Settings" tab.
35+
I will click on this linode to see more data about it. What I am interested in is the "Settings" tab.
3636
![Settings Tab](/images/linode/server_monitoring/settings.png)
3737

3838
Scroll to the bottom of the "Settings" tab to see the "Shutdown Watchdog" section. Toggle the switch to enable this feature.
3939
![Lassie](/images/linode/server_monitoring/lassie.png)
4040

41+
Now, every time the Linode unexpectedly goes offline, Lassie can bring it back online automatically.
42+
4143

4244
## Performance of the Server
4345

44-
For vital server and service performance metrics, performance monitoring tools are used. These tools can be equated to a car's dashboard which shows all car performance details such as speed and fuel consumption. We will begin by first looking at the default tools that monitor performance of a server then gradually check out a few more technical tools we can use.
46+
For vital server and service performance metrics, performance monitoring tools are used. These tools can be equated to a car's dashboard which shows all car performance details such as speed and fuel consumption. We will begin by first looking at the [default tools that monitor performance of a server](#linode-cloud-manager) then gradually check out [a few more technical tools](#linux-system-monitoring-fundamentals) we can use.
47+
4548

4649

4750
### Linode Cloud Manager
@@ -79,10 +82,15 @@ Monitoring tools help to reassure us when things are working right, they help us
7982
- Provide administrative data
8083
- Sometimes automate responses to anomalies
8184

82-
Data on each key performance indicator (KPI), network connectivity and application availability is collected and used for analysis. For example, working hardware, available server, server resources are sufficient, no bottlenecks are slowing things down and visualization of data.
85+
Data on each key performance indicator (KPI), network connectivity and application availability is collected and used for analysis. For example, the data can show whether the hardware is working, whether the server is available, whether server resources are sufficient and whether any bottlenecks are slowing things down. The data can also be visualized.
8386

8487
Thankfully, we have dozens of server system monitoring tools built into Linux. I will show you how to use the `top` command to see Linux processes ordered by CPU activity. There are many more tools, such as [System Activity Report (sar)](https://linux.die.net/man/1/sar), [vmstat](https://linux.die.net/man/8/vmstat), [Monitorix](https://www.monitorix.org/), [Nethogs](https://github.com/raboof/nethogs), [Glances](https://nicolargo.github.io/glances/), [htop](https://htop.dev/) and [Netdata](https://www.netdata.cloud/).
8588

89+
The main tools we shall look at in great detail are:
90+
91+
- [Using `top`](#monitor-server-performance-using-top) (Linux)
92+
- [Using Longview](#monitor-server-performance-using-longview) (Linode)
93+
8694

8795
### Monitor Server Performance using `top`
8896

@@ -102,7 +110,7 @@ top - 14:56:17 up 127 days, 22:19, 2 users, load average: 0.01, 0.01, 0.00
102110
```
103111

104112
- The first line contains the **time, the uptime and load averages of the server**. The load average is displayed over 1, 5, and 15 minutes to provide a better overall look at the load my server has undertaken.
105-
- To properly read the load average, we need to know how many CPUs our Linode has. If there is 1 CPU, then a load average of 1.00 eans that the server is operating at its capacity. This number increases to 2 if the number of CPUs is 2, etc.
113+
- To properly read the load average, we need to know how many CPUs our Linode has. If there is 1 CPU, then a load average of 1.00 means that the server is operating at its capacity. This number increases to 2.00 if the number of CPUs is 2, etc.
106114
- A load average of 0.70 for a Linode with 1 core is generally considered a threshold. Anything higher requires reconfiguration of resources or the need to upgrade.
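
The load-average rule above can be turned into a quick check. The following is a minimal sketch, assuming a Linux system with `awk` and `nproc` available. The function names `load_per_core` and `is_overloaded` are hypothetical helpers, not part of `top`, and the 0.70 default is simply the threshold mentioned above.

```shell
# Hypothetical helper: divide a load average by the number of CPU cores.
# $1 = load average (e.g. 1.00), $2 = number of cores (e.g. 2)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Hypothetical helper: succeed (exit status 0) when the per-core load
# is above the threshold. $3 is optional and defaults to 0.70.
is_overloaded() {
    awk -v load="$1" -v cores="$2" -v limit="${3:-0.70}" \
        'BEGIN { exit !((load / cores) > limit) }'
}

# On a live Linux system, read the 1-minute load average and the core count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    one_minute_load=$(cut -d ' ' -f 1 /proc/loadavg)
    echo "1-minute load per core: $(load_per_core "$one_minute_load" "$(nproc)")"
fi
```

For example, `is_overloaded 0.90 1 && echo "consider upgrading"` prints the message, because a load of 0.90 on a single core is above 0.70, while the same load on 2 cores (0.45 per core) is not.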
107115

108116

@@ -119,7 +127,7 @@ Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie
119127
- The third line is the **CPU percentages**:
120128
- User CPU time (`us`)
121129
- System CPU time (`sy`)
122-
- Nice time (`ni`) - time spend on low prioity processes
130+
- Nice time (`ni`) - time spent on low priority processes
123131
- Idle time (`id`)
124132
- Time spent on wait I/O processes (`wa`)
125133
- Time handling hardware interruptions (`hi`)
@@ -209,7 +217,7 @@ Interactively, we can issue the following commands in an active `top` session:
209217

210218
There is [htop](http://hisham.hm/htop/), which is similar to `top`, but offers an easier interface with color, mouse operations, and horizontal and vertical scrolling, making it more intuitive.
211219

212-
To use it, we first need to install it by running th command:
220+
To use it, we first need to install it by running the command:
213221

214222
```shell
215223
$ sudo apt install htop
@@ -223,4 +231,170 @@ $ htop
223231

224232
![htop](/images/linode/server_monitoring/htop.png)
225233

226-
You can use your mouse to scroll the interactive process viewer. You can click on a process using yoru mouse to highlight it then press `k`, for example, to kill it. At the bottom, you will notice a few buttons that you can click on.
234+
You can use your mouse to scroll the interactive process viewer. You can click on a process using your mouse to highlight it, then press `k`, for example, to kill it. At the bottom, you will notice a few buttons that you can click on.
235+
236+
237+
### Monitor Server Performance using Longview
238+
239+
Linode provides a data graphing service called Longview. It does an excellent job of tracking metrics for CPU, memory and network bandwidth, and offers real-time graphs that can help expose performance problems. In the following sections, we shall learn how to:
240+
241+
- [Add a Longview Client](#add-a-longview-client)
242+
- [Install the Longview agent](#install-the-longview-agent)
243+
- [Access and view our Longview client's data and graphs](#access-and-view-longview-clients-data-and-graphs)
244+
- [Longview Data Explained](#longview-data-explained)
245+
- [Uninstall the Longview client](#uninstall-the-longview-client)
246+
247+
248+
### Add a Longview Client
249+
250+
Ensure that you are logged in to your [Linode Cloud Manager](https://cloud.linode.com/dashboard). On the left sidebar, click on the Longview link.
251+
252+
![Longview link](/images/linode/server_monitoring/longview_link.png)
253+
254+
The Longview dashboard has two tabs, the Clients tab and the Plan Details tab. Ensure you have selected the Clients tab. Click on the blue "Add Client" button to add a new client.
255+
256+
![Add client](/images/linode/server_monitoring/add_client.png)
257+
258+
I currently have one client installed for my _official_personal_website_ linode. I will be creating a new client for the second linode _tinkereducationnewsletter_.
259+
260+
Once the button is clicked, you will notice that an entry appears displaying your Longview Client instance along with its auto-generated label, its current status, installation instructions, and API key. Its status will display as "Waiting for data", since we have not yet installed the Longview agent on a running Linode.
261+
262+
![New client](/images/linode/server_monitoring/new_client.png)
263+
264+
The long string appended to the URL `https://lv.linode.com/` is the globally unique identifier (GUID) of my Longview Client instance.
265+
266+
267+
### Install the Longview Agent
268+
269+
Now, we need to navigate to our Linode to install the Longview agent to monitor and visualize our system.
270+
271+
```shell
272+
$ ssh user@IP_address
273+
274+
# Output
275+
276+
user@project:~$
277+
```
278+
279+
I have logged into my Linode over SSH. You will need to replace `user` with your actual Linode user and `IP_address` with your Linode's IP address. I have chosen to log in as a non-root user because it is always advisable not to use root. My `user` can run commands with root privileges through `sudo`. If you are not familiar with these, I'd recommend you check out the tutorial [Deploy Your Flask App on Linode](/linode/deploy_on_linode.md).
280+
281+
Back in the Linode Cloud Manager, we need to copy the `curl` command shown in the new Longview client we have just created and paste it into our Linode's terminal.
282+
283+
```shell
284+
user@project:~$ curl -s https://lv.linode.com/long-string-url | sudo bash
285+
```
286+
287+
Press "Enter" on your keyboard to execute the command. It will take a few minutes for the installation to complete. You may be asked to accept or deny the autoconfiguration of Longview during the installation process. Select "Yes" and press "Enter" to continue with the process.
288+
289+
![Longview autoconfiguration](/images/linode/server_monitoring/longview_autoconfiguration.png)
290+
291+
This popup occurs when Longview can’t locate the NGINX status page. In turn, this could indicate that the status page is in an unusual and unspecified location, or that the status module isn’t enabled, or that NGINX itself is misconfigured.
292+
293+
Because we clicked "Yes", the Longview tool will attempt to enable the status module, set the status page location in a new vhost configuration file, and restart NGINX. This option is easier, but has the potential to disrupt your current NGINX configuration.
294+
295+
The file can be found in `/etc/nginx/sites-enabled`. Opening this file, we can see the following:
296+
297+
```shell
298+
user@project:~$ sudo nano /etc/nginx/sites-enabled/longview
299+
300+
# Output
301+
302+
server {
303+
listen 127.0.0.2:80;
304+
server_name 127.0.0.2;
305+
306+
location /nginx_status {
307+
stub_status on;
308+
allow 127.0.0.1;
309+
deny all;
310+
}
311+
}
312+
```
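
To confirm that the status page is reachable, we can request it from the Linode itself. The sketch below assumes the vhost shown above is active, that `curl` is installed, and that the page returns the usual `stub_status` text whose first line starts with `Active connections:`. The function `active_connections` is a hypothetical helper that extracts that first number.

```shell
# Hypothetical helper: read stub_status output on standard input and
# print the number of active connections (the third field of line one).
active_connections() {
    awk '/^Active connections:/ { print $3 }'
}

# Usage on the Linode (assumes the Longview vhost above is enabled):
#   curl -s http://127.0.0.2/nginx_status | active_connections
```

If the request is refused, the `allow` and `deny` rules in the vhost are the first place to look.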
313+
314+
With the installation complete, we can verify that the Longview agent is running:
315+
316+
```shell
317+
user@project:~$ sudo systemctl status longview
318+
319+
# Output
320+
321+
● longview.service - LSB: Longview Monitoring Agent
322+
Loaded: loaded (/etc/init.d/longview; generated)
323+
Active: inactive (dead)
324+
Docs: man:systemd-sysv-generator(8)
325+
```
326+
327+
This agent is not running. To start it, we use the following:
328+
329+
```shell
330+
user@project:~$ sudo systemctl start longview
331+
332+
# Nothing will be seen
333+
```
334+
335+
We need to run the status command once again.
336+
337+
```shell
338+
user@project:~$ sudo systemctl status longview
339+
340+
# Output
341+
342+
● longview.service - LSB: Longview Monitoring Agent
343+
Loaded: loaded (/etc/init.d/longview; generated)
344+
Active: active (running) since Wed 2022-11-30 06:36:43 UTC; 10s ago
345+
Docs: man:systemd-sysv-generator(8)
346+
Process: 634744 ExecStart=/etc/init.d/longview start (code=exited, status=0/SUCCESS)
347+
Tasks: 1 (limit: 1066)
348+
Memory: 205.6M
349+
CGroup: /system.slice/longview.service
350+
└─634749 linode-longview
351+
352+
Nov 30 06:36:43 tinkereducationnewsletter systemd[1]: Starting LSB: Longview Monitoring Agent...
353+
Nov 30 06:36:43 tinkereducationnewsletter longview[634744]: * Starting Longview Agent longview
354+
Nov 30 06:36:43 tinkereducationnewsletter longview[634744]: ...done.
355+
Nov 30 06:36:43 tinkereducationnewsletter systemd[1]: Started LSB: Longview Monitoring Agent.
356+
357+
```
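
The check-then-start steps above can be combined into a single helper. This is a minimal sketch, assuming a systemd-based distribution and a user with `sudo` access. The function name `ensure_running` is hypothetical; it relies on `systemctl is-active --quiet`, which reports a unit's state through its exit status.

```shell
# Hypothetical helper: start a systemd service only if it is not already active.
# $1 = service name, e.g. longview
ensure_running() {
    if systemctl is-active --quiet "$1"; then
        echo "$1 is already running"
    else
        echo "$1 is not running, starting it"
        sudo systemctl start "$1"
    fi
}

# Usage on the Linode:
#   ensure_running longview
```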
358+
359+
360+
### Access and View Longview Client's Data and Graphs
361+
362+
363+
To see the metrics, let us switch back to the Linode Cloud Manager and reload the Longview page. Occasionally, it may take several minutes for data to load and display in the Cloud Manager.
364+
365+
![Longview data](/images/linode/server_monitoring/double_longview_agents2.png)
366+
367+
368+
### Longview Data Explained
369+
370+
371+
To view the details of a Longview client, let us click the link "View Details".
372+
373+
![View details link](/images/linode/server_monitoring/view_details_link.png)
374+
375+
We will be redirected to the Longview Client's "Overview" tab.
376+
377+
![Longview client data](/images/linode/server_monitoring/longview_client.gif)
378+
379+
- The "Overview" tab shows all of your system’s most important statistics in one place.
380+
381+
- The "Processes" tab lists all the processes currently running on my Linode, along with additional statistics.
382+
- The "Network" tab sorts traffic statistics by network interface available on my Linode.
383+
- The "Disks" tab shows data on disk input/output (I/O), disk space usage and [inode](https://en.wikipedia.org/wiki/Inode) usage over time.
384+
- The "Nginx" tab (I used Nginx on my Linode) keeps track of Nginx settings, workers and requests, system resource consumption, etc.
385+
- The "Installation" tab has instructions on how to install the Longview agent on a Linode and the client instance API key.
386+
387+
388+
### Uninstall the Longview Client
389+
390+
Back in the Linode Cloud Manager Dashboard, we need to click on the "Longview" link on the left sidebar to list all available client instances.
391+
392+
![Delete client instance](/images/linode/server_monitoring/delete_longview_client_instance2.png)
393+
394+
On the top-right corner of each client instance, there is an ellipsis button. Once clicked, we can see "Delete". At your discretion, you can click on "Delete" to delete the Longview client.
395+
396+
On our Linode, we can then remove the Longview agent by running the following command:
397+
398+
```shell
399+
user@project:~$ sudo apt-get remove linode-longview
400+
```
