
Analysis of Celestia light node performance

meowmix 2023. 4. 2. 23:22

 

Celestia is the first modular blockchain network. Its mission is to make deploying a blockchain as easy as deploying a new smart contract. Celestia introduces what is called the data availability layer, which enables efficient scaling and allows L2 rollups to perform data availability sampling for the transaction data they need.

The Blockspace Race will help put the Celestia network through its paces to harden features and prepare node operators for mainnet launch. You can learn more about the design of the program and its importance for the Celestia community in the official announcement post.

 

As one of the 1,000 participants in the Celestia Blockspace Race, I'm really looking forward to this.

Today, we're going to take some time to analyze a Celestia light node. This corresponds to "Perform analysis of your node", one of the tasks in the Blockspace Race. (https://docs.celestia.org/nodes/itn-node-analysis/)

 

 

Prerequisites

- An installed Celestia light node

- Node monitoring tools

 

Since this post is not about setting up a light node, I will leave that part to a separate tutorial link.

 

However, since we are going to analyze the hardware behavior of an operating node, we should keep the node's hardware requirements in mind.

 

 

Hardware requirements


The following minimum hardware requirements are recommended for running a light node:

Memory: 2 GB RAM
CPU: Single Core
Disk: 25 GB SSD Storage
Bandwidth: 56 Kbps for Download/56 Kbps for Upload

 

For stable node operation, I am running hardware with slightly higher specifications than the minimum:


Memory: 4 GB RAM
CPU: Triple Core
Disk: 80 GB SSD Storage
Bandwidth: Over 500 Kbps for Download & Upload

Many VPS providers offer graphical server monitoring.

 

However, this post performs the analysis with Grafana + Prometheus, for those who want information their VPS provider does not expose and for those who run the node on their own hardware.

 

 

Setting up node monitoring tools

 

*This tutorial is based on https://grafana.com/docs/grafana-cloud/quickstart/noagent_linuxnode/

 

First of all, navigate to https://grafana.com/products/cloud/, create an account, and generate an API key.

 

Install and run node_exporter on the node

wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz

Extract the node_exporter binary

tar xvfz node_exporter-1.5.0.linux-amd64.tar.gz

Change to the directory created during extraction

cd node_exporter-1.5.0.linux-amd64/

Make the node_exporter binary executable

chmod +x node_exporter

Run the node_exporter binary

./node_exporter

Test that metrics are being exported on port 9100

curl http://localhost:9100/metrics

 

If you see metrics on your screen, all is well. If not, check your steps for typos, make sure the binary is executable, and confirm that curl works with other URLs.
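As an extra sanity check, you can filter the output for a metric node_exporter is known to expose, such as node_cpu_seconds_total:

curl -s http://localhost:9100/metrics | grep '^node_cpu_seconds_total' | head -n 5

If a few lines come back, the exporter is producing data.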

 

If you don’t want to start node_exporter directly from the command line, you can create a systemd service for it, similar to Creating a systemd service to manage the agent.
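For reference, a minimal unit file might look like the sketch below. This is my own example rather than the one from the Grafana docs; it assumes you copy the binary to /usr/local/bin and are fine running the service as root:

sudo cp node_exporter /usr/local/bin/

sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus node_exporter
After=network-online.target

[Service]
# The path below is an assumption; point it at wherever you keep the binary
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload

sudo systemctl enable --now node_exporter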

 

Next, we will download and install Prometheus on the node to scrape the metrics being provided by node_exporter and send them to Grafana Cloud.

 

Download the Prometheus compressed package

wget https://github.com/prometheus/prometheus/releases/download/v2.43.0/prometheus-2.43.0.linux-amd64.tar.gz

Extract the binary

tar xvf prometheus-2.43.0.linux-amd64.tar.gz

Change to the directory created during extraction

cd prometheus-2.43.0.linux-amd64

 

Create a configuration file for Prometheus so that it can scrape the metrics and send them to Grafana Cloud. This configuration file has many options; for our example, it only needs three sections:

 

  • "global" is the section into which configurations common across all Prometheus actions are placed. In this example, we set the scrape_interval for checking and grabbing metrics from configured jobs to happen every 15 seconds.
  • "scrape_configs" is where we name our job; this name will be used in Grafana to help you find associated metrics. It is also where we configure Prometheus to find the metrics for that job.
  • "remote_write" is where we instruct Prometheus to send the scraped metrics to a secondary endpoint.
    Edit this file to include your Grafana Cloud username and the API key you created earlier.

 

To confirm your username and URL, first navigate to the Cloud Portal, then from the Prometheus box, click Send Metrics.

 

 

Create a Prometheus configuration file named "prometheus.yml" in the same directory as the Prometheus binary with the following content. You must replace the placeholder values with your own remote_write endpoint, username, and API key.

 

global:
  scrape_interval: 60s

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']

remote_write:
  - url: '<Your Metrics instance remote_write endpoint>'
    basic_auth:
      username: 'your grafana username'
      password: 'your Grafana API key'

 

Run prometheus

./prometheus --config.file=./prometheus.yml
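Prometheus listens on port 9090 by default, so a quick way to confirm it started correctly is to hit its readiness endpoint:

curl http://localhost:9090/-/ready

It should respond that the server is ready.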

If you don’t want to have to start Prometheus directly from the command line every time you want it to run, you can create a systemd service for it, similar to Creating a systemd service to manage the agent.
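As a rough sketch (again my own example, not from the Grafana docs), a unit file for Prometheus could look like the following; the paths assume you extracted the archive under /root and should be adjusted to your setup:

sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus
After=network-online.target

[Service]
# Paths are assumptions; adjust them to where you extracted Prometheus
WorkingDirectory=/root/prometheus-2.43.0.linux-amd64
ExecStart=/root/prometheus-2.43.0.linux-amd64/prometheus --config.file=/root/prometheus-2.43.0.linux-amd64/prometheus.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload

sudo systemctl enable --now prometheus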

 

Check that metrics are being ingested into Grafana Cloud

 

Within minutes, metrics should begin to be available in Grafana Cloud. To test this, use the Explore feature. Click Explore in the sidebar to start; this takes you to the Explore page.

 

At the top of the page, use the dropdown menu to select your Prometheus data source.

Use the Metrics dropdown to find the entry for node, which is the "job_name" we created in "prometheus.yml".

 

If node is listed, this confirms that metrics are being received. If it is not listed, metrics are not being collected.

If metrics do not appear after several minutes, check your steps for typos, make sure the binary is executable, and confirm that Prometheus is running on the Linux machine.
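You can also check from the node itself, without the Grafana UI, by asking the local Prometheus instance whether the node job is being scraped (this assumes Prometheus is still serving its HTTP API on the default port 9090):

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{job="node"}'

A result with value "1" means the node_exporter target is up and being scraped.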

 

Import a dashboard

 

Official and community-built dashboards are listed on the Grafana website Dashboards page.

 

Dashboards on this page will include information in the Overview tab about special configurations you may need to use to get the dashboard to work. For our example, we require a dashboard that is built to display Linux Node metrics using Prometheus and node_exporter, so we chose Linux Hosts Metrics | Base. Note the ID of the dashboard: 10180. We will use this ID in the next step.

 

  1. In Grafana, click Dashboards in the left-side menu to go to the Dashboards page.

  2. Click New and select Import in the dropdown.

  3. Enter the ID number of the dashboard we selected.

  4. Click Load.

 

You'll get a dashboard populated with your node's metrics.

 

 

Now we can perform the Celestia light-node analysis on the Grafana dashboard.

 

 

Performance Analysis

 

Measurements were taken over approximately 15 days, from March 31, 2023 to April 15, 2023, covering CPU, memory, network traffic, and disk usage; the node ran stably throughout, which made for smooth measurements.

 

 


 

As mentioned earlier, the node was able to operate smoothly under very low load, since it runs on hardware slightly above the minimum specification for the sake of stability.

 

On April 1 and April 12, all measurements show a visible change because the node was stopped and restarted for a stress test.

 

 

Analyzing in more detail

 


The average CPU load was 0.8%, which is encouraging considering that the monitoring tools were running on the same machine.

CPU load increased when the node was restarted, peaking at 2.61%, but this is still a very small fraction of total capacity and puts very little burden on the machine.
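If you want to reproduce this figure outside the dashboard, a common node_exporter expression for overall CPU utilization is shown below. It is my own example query, sent to the local Prometheus API; the 5-minute window is arbitrary:

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'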

 

 


Before analyzing memory, we first have to determine how much memory the performance measurement tools themselves use.


When the node is shut down, free memory increases by about 0.8 GB, which indicates that the performance measurement tools take up about 0.8 GB of memory.

 

So can we now work out the memory usage of celestia-node itself? We can see that it uses about 0.8 to 1.2 GB out of the 4 GB of total memory.

Like the CPU load, this is a comfortable figure.
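The memory numbers above can be cross-checked with a simple query that subtracts available memory from total memory; both metrics come from node_exporter, and the query itself is just my own example:

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes'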

 

 


Network traffic peaked at 17.3 MB inbound and 9.61 MB outbound during the node restart (outbound traffic is plotted as negative on the dashboard), and averaged around 100~200 KB.
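The same inbound and outbound rates can be pulled as raw numbers from the per-interface byte counters; the device label below (eth0) is an assumption and should be changed to your actual interface:

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=rate(node_network_receive_bytes_total{device="eth0"}[5m])'

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=rate(node_network_transmit_bytes_total{device="eth0"}[5m])'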

 

 


Disk space was not as generous as expected: about 4 GB was consumed over the 15 days. At that rate, the node could run for roughly three months on the 25 GB of disk space given as the minimum hardware requirement.
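The three-month estimate is simple arithmetic: 25 GB divided by roughly 4 GB per 15 days gives about 94 days. To track how free space actually evolves, you can query the filesystem metric from node_exporter; the root mountpoint below is an assumption, so adjust it if your node's data lives on a different filesystem:

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=node_filesystem_avail_bytes{mountpoint="/"}'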

 

Of course, this does not mean much while we are still in the testnet period, but it is expected that more disk space will be needed to operate a mainnet node over the long term.

 

 


 

Both the disk activity and disk I/O figures are uniform on average, apart from the node restarts.

 

Disk activity recorded reads of around 300 KB and writes of around 100 KB. In addition, disk I/O time showed reads (sda) of around 100 µs and writes of around 2.7 ms (writes are plotted as negative on the dashboard).
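These panels map onto node_exporter's per-device disk counters; for example, read and write throughput for the sda device can be queried like this (my own example expressions):

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=rate(node_disk_read_bytes_total{device="sda"}[5m])'

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=rate(node_disk_written_bytes_total{device="sda"}[5m])'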

 

 

Conclusion

 

It has been confirmed that the Celestia light node runs without problems on minimal specifications. However, it is worth considering more disk space for long-term operation.

 

Note that this analysis is subjective, and results may look different depending on the measurer's hardware.

 

If you want to reach me, contact me at:

 

Discord: dricake#2574

Github: https://github.com/osrm

 

Thank you!