Fundamentals of technical literacy for young digital project managers. Part 2

Details: Tech; Blog; 08 March 2023

Author: Qian Si Ying

What servers are and how to configure them, why backups are needed, and a brief look at how email and SSL certificates work

In the last episode, we talked about how the Internet works and why IP addresses and DNS servers are needed, drew diagrams, and sorted out frameworks and CMSs. If you missed it, it is better to go back and read it first. Today brings a new portion of technical subtleties in simple language. By the way, you can watch a video instead of reading (and even see in real time how "programmers program").

Servers: what types there are and how they differ

To begin with, a reminder: hosting is a service that provides resources for placing files on a server that is constantly online. Which server exactly is a matter of choice: Shared hosting (virtual hosting), a VPS (Virtual Private Server, also sold as a VDS, Virtual Dedicated Server), or a Dedicated server.

Shared hosting is like a room in a communal apartment: the hosting provider sets everything up for you, and almost everything you can do there depends on the provider's goodwill. If the provider lets you configure something yourself, you configure it; if not, well, sorry. That's why we almost never use it in the studio.

Shared hosting may be preferable if the client does not have even one specialist who could set up a VPS. It also comes with a graphical settings panel, which lets a non-specialist cope with it. But that same panel gets in the way when you need to install additional software.

A VPS and a Dedicated server are practically the same in terms of configuration options: you can install whatever software you need. On Shared hosting, if you need to put a PDF document generator on the site, it won't work; if you want to switch the search engine to Sphinx, it probably won't work either. On a VPS or a Dedicated server, it's easy.

The difference between the last two types: a VPS is like an apartment in a high-rise building, and a Dedicated server is like a house of your own. Dedicated servers are large and powerful in terms of resources, but they cannot be expanded just like that (you can't squeeze land out of a neighbor; you can only buy a new plot to build on).

Because a VPS is virtual, you can almost always switch to a more expensive tariff with more resources without reconfiguring anything on the server itself (when the family grows, you rent a bigger apartment in the same high-rise). There are exceptions: capacity can usually be increased only within one tariff line and without changing the virtualization technology (the engine that allows several operating systems to run on one physical server). VPS servers are used most often on small and medium-sized projects, but on large-scale ones as well.

A virtual server works like this: one large and very powerful computer is taken, and its capacity is divided into several parts. One part is sold to one client, another to a second, and a third to yet another. If a client wants more capacity, they pay extra and their part grows. If a client no longer fits within the resources of one physical machine, they are simply moved to another one.

A Dedicated server is real hardware: a computer that sits in the hosting provider's data center. The provider gives it uninterrupted power and a reliable Internet connection, and only you use this server. This also means that its resources cannot be increased just like that.

To choose a suitable server, you need to estimate the approximate number of visitors and the approximate amount of resources the site will need to run. In addition, cloud servers are very popular: a unified system of servers that hosts the project, where the allocated capacity is not limited to one server but is distributed across several at once. They are usually run by large technology companies like Amazon.

The advantage of a cloud server is that you can usually work with it through an API: the provider exposes an interface, and you can write a script that regulates the load on the server. For example, the script can automatically start one, two, or three additional servers depending on the time of day, so that user requests are processed faster (useful if the number of site users varies greatly during the day).
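
A minimal sketch of what such a script might look like, in Python. The hour ranges, the server counts, and the `set_extra_servers` callback are all hypothetical: every cloud provider has its own API, so the callback only stands in for whatever call actually starts or stops servers.

```python
from typing import Callable

# Hypothetical schedule: (first hour, last hour exclusive, extra servers).
# The hours and counts are illustrative assumptions, not measured values.
EXTRA_SERVERS_BY_HOUR = [
    (0, 8, 0),    # night: the base server is enough
    (8, 18, 1),   # working day: one extra server
    (18, 23, 3),  # evening peak: three extra servers
    (23, 24, 0),  # late evening: back to the base server
]


def extra_servers_for_hour(hour: int) -> int:
    """Return how many additional servers should run at the given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    for start, end, extra in EXTRA_SERVERS_BY_HOUR:
        if start <= hour < end:
            return extra
    return 0


def scale(hour: int, set_extra_servers: Callable[[int], None]) -> int:
    """Compute the target for this hour and pass it to the provider call.

    `set_extra_servers` is a placeholder for the cloud provider's API;
    it is not a real library function.
    """
    target = extra_servers_for_hour(hour)
    set_extra_servers(target)
    return target
```

Run from a scheduler once an hour, a script like this keeps the number of servers in line with the expected load. A more careful version would look at the actual load rather than the clock.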

The disadvantage of cloud servers is that they are more expensive than a VPS. Billing is often hourly, and running one 24 hours a day for a month costs 2-3 times more than a comparable VPS. On the other hand, when you don't need additional resources, you don't pay for them. Overall, though, in our experience the cloud turned out to be 2-3 times more expensive than having our own server (we are talking about quite powerful configurations with a lot of traffic).
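
The trade-off comes down to simple arithmetic, sketched below in Python. All prices are made-up numbers, chosen only so that a cloud server running around the clock costs 2.5 times as much as the VPS, in line with the 2-3 times mentioned above; substitute real tariffs before drawing any conclusions.

```python
HOURS_PER_MONTH = 24 * 30  # simplification: a 30-day month


def monthly_cloud_cost(hourly_rate: float, hours_used: float) -> float:
    """Cost of one cloud server billed by the hour."""
    return hourly_rate * hours_used


def break_even_hours(hourly_rate: float, vps_monthly_price: float) -> float:
    """Hours per month below which the cloud server is cheaper than the VPS."""
    return vps_monthly_price / hourly_rate


# Hypothetical prices, for illustration only.
VPS_MONTHLY = 40.0
CLOUD_HOURLY = 100.0 / HOURS_PER_MONTH  # 100 per month if run 24/7

always_on = monthly_cloud_cost(CLOUD_HOURLY, HOURS_PER_MONTH)
peak_only = monthly_cloud_cost(CLOUD_HOURLY, 6 * 30)  # 6 peak hours a day
```

With these assumed prices, an extra server that runs only six hours a day costs 25 a month against 40 for a permanent VPS, and the break-even point is 288 hours a month (9.6 hours a day). Beyond that, the VPS is cheaper.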

As projects grow and develop, server resources sometimes begin to run out: pages slow down, and during peak hours the server cannot cope with the load and may become unavailable. In this case, we look at what exactly is slowing the server down. If we have hit a resource limit, we choose whichever is cheaper: buying a new, more powerful server or optimizing the project code. We look for the bottleneck and try to widen it. If everything that could be optimized already has been, the only option left is to add hardware.

When moving to a more powerful virtual server, everything is simple: you buy a bigger tariff, and resources are added. With a Dedicated server, that trick will not work: you have to buy a new separate server, configure it from scratch, and copy the data from the old server to the new one (and this is no longer test data, but all the real content of the live site: billions of images, a huge database, and so on). All of this takes time.

There is an indicator usually written into an SLA (service level agreement): the percentage of time the server is available. For websites, it is usually not very critical: if an online store is unavailable for an hour a month, it is unpleasant but not fatal. But if the project requirements include 99.97% availability (which allows only about 13 minutes of downtime in a 30-day month), then even with a relatively low load you need additional servers to ensure reliability. It is better to approach this comprehensively and think through all possible scenarios at once: for example, instead of two VPS servers from one hosting provider, take two from different providers.
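
To get a feel for what an availability percentage means in practice, it helps to convert it to minutes. The Python sketch below does that, and also estimates the availability of two redundant servers. The second function assumes that the servers fail independently and that switching between them is instant, which is an idealization (and the reason for taking them from different providers).

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted over `days` at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


def combined_availability(a_percent: float, b_percent: float) -> float:
    """Availability of two redundant servers, assuming independent failures."""
    a_down = 1 - a_percent / 100
    b_down = 1 - b_percent / 100
    # The site is down only when both servers are down at the same time.
    return (1 - a_down * b_down) * 100
```

For example, `allowed_downtime_minutes(99.97)` gives roughly 13 minutes for a 30-day month, while 99.9% allows about 43 minutes. Under the independence assumption, two servers at 99% each give about 99.99% together.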

There are several fundamentally different strategies for increasing server capacity. You can move individual services to separate servers (for example, move the database to its own server, though it is important not to lose performance on the network link between it and the other servers). You can also spread users across several servers using a load balancer. And if the database server itself runs out of capacity, the database can be distributed across several servers too (for example, using sharding).
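
The last two strategies can be illustrated with a short Python sketch. The server names are hypothetical, and real projects normally rely on a ready-made balancer and on the database's own sharding tools rather than on hand-written code; the point here is only to show the idea.

```python
import hashlib
from itertools import cycle
from typing import Sequence

# Hypothetical server names, for illustration only.
APP_SERVERS = ("app-1", "app-2", "app-3")
DB_SHARDS = ("db-shard-0", "db-shard-1")

_next_app_server = cycle(APP_SERVERS)


def pick_app_server() -> str:
    """Round-robin balancing: each request goes to the next server in turn."""
    return next(_next_app_server)


def shard_for_user(user_id: int, shards: Sequence[str] = DB_SHARDS) -> str:
    """Hash-based sharding: the same user always maps to the same shard."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Round-robin is the simplest balancing rule; it ignores how busy each server is. The modulo-based shard choice is just as simple, but it has a known cost: when the number of shards changes, most users map to a different shard and their data has to be moved.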

Server Control Panels

We don't have much respect for hosting control panels (such as ISPmanager). Yes, they make basic setup easier, but they get in the way of fine-tuning. Most often, software on a server is configured through configuration files (text files holding the settings of different programs). The graphical interface (the server control panel) is a layer that supposedly makes editing these files easier. But usually "makes easier" means "hides what is hard to understand." Yes, beginners need a simple tool. For the pros, it is not enough.

If we want to do some fine-tuning, the settings we need probably won't be in the control panel. We can edit the configuration files by hand once, but with any change made in the panel, our edits will most likely be overwritten.

If for some reason the client still needs the panel, this has to be clarified in advance. We need to understand whether it is actually worthwhile and, if it is not, discuss it and convince the client that there is no such need. Some customers buy hosting without looking closely and get access to a control panel along with it, even though ISPmanager, for example, costs money. It is also important that the customer does not accidentally log in to it and wipe all the settings with one click.

Configuring the Server

To configure the server, you need full administrative access to it. In Linux, the administrator account is called root (the superuser). Accordingly, when buying a server, the customer must give the developers the login and password of the root user and the address of the server itself, so that they know where to connect. Access to the server can be by password or by SSH key.

So now we have a clean server. First, you need to make sure that it responds on some domain and that the site opens when that domain is accessed. During setup, all the necessary software is installed, the database and optimal settings for the CMS or framework are prepared, and separate accesses are created: SSH access for connecting and uploading the program code, and database access for loading the site data.

We prefer to set servers up ourselves and don't really like pre-configured ones. Historically, our operating system of choice is Debian (or Ubuntu, which is derived from Debian), together with a ready-made set of scripts that we run on a clean server to automatically install the necessary software and create configurations and accesses. Another advantage of configuring servers ourselves is that this set of scripts is updated centrally whenever a critical vulnerability comes out. That said, arguing about which operating system is "better" is like arguing about which keys should switch the keyboard layout.

Bitrix, realizing that users launch sites haphazardly, prepared a set of scripts and BitrixVM, a virtual server with pre-configured parameters. It is well tuned for typical tasks, and if you reinforce it with your own additional modules, the result is excellent: fast and tidy. Most often, we supplement it with modules for generating PDF files and with the Google PageSpeed module to improve performance (it optimizes images and some scripts).

Deploying the site code to the server

To begin with, of course, you need a configured server. Then, once the project has passed the Bitrix quality monitor (a tool for checking project quality), the programmer copies the site code and core to the production server and deploys the database (as a rule, both the code and the database are rolled out by a deployment script; deployment is the procedure of putting the code onto the server and getting it running). In the process, a backup copy of the site is created and transferred to the configured server, where a script deploys it automatically.

Then the programmer checks that everything has been transferred to the production server correctly and tests all the integrations and installed modules. That, in general, is where the initial deployment of the project ends.

With subsequent deployments, everything is a little more complicated. When we make improvements, they most often involve changes to the database: new information blocks or order fields are added, integrations with external services are configured. Meanwhile, the database of the site that is already live is not static: the customer uploads content, and if it is an online store, real orders appear there. This data must not be lost under any circumstances, so you can't simply copy the whole database from the test server to the production one.

A mechanism called "database migration" comes to the rescue: the programmer describes in code what changes should be made to the database. There are automated tools for creating migrations for database tables or infoblocks (our clean build includes a migration module). But migrations are not that simple. For example, transferring payment system settings in Bitrix through migrations is very labor-intensive: writing a full-fledged migration costs more than making the change manually. If someone writes a solution for this one day, it will be great, but for now that is how things are.
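
The idea of a migration can be shown with a small Python sketch that uses the standard `sqlite3` module. This is not the Bitrix migration module, only an illustration of the principle: the `orders` table, its columns, and the `schema_migrations` bookkeeping table are made-up names.

```python
import sqlite3
from typing import List, Set

# Hypothetical migrations: (version, SQL). Table and column names are made up.
MIGRATIONS = [
    (1, "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL)"),
    (2, "ALTER TABLE orders ADD COLUMN delivery_comment TEXT"),
]


def applied_versions(conn: sqlite3.Connection) -> Set[int]:
    """Read which migrations this database has already received."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)"
    )
    return {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}


def migrate(conn: sqlite3.Connection) -> List[int]:
    """Apply every migration not applied yet; return the versions just applied."""
    done = applied_versions(conn)
    newly_applied = []
    for version, sql in sorted(MIGRATIONS):
        if version in done:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

The key property is that existing data stays in place: running `migrate` on the production database adds the new column without touching the orders already stored there, and running it a second time does nothing.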

In general, there is always a risk of losing something during a deployment, so a smoke test is definitely needed (a test of the most basic functionality, confirming that a user can get through their main scenario without problems). Its purpose is to make sure that nothing major was broken by the changes.
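
In its simplest form, a smoke test just checks that the key pages still open after a deployment. Below is a minimal Python sketch; the list of paths is hypothetical, and the function that actually performs the HTTP request is passed in from outside, so the check itself does not depend on any particular HTTP client.

```python
from typing import Callable, Dict, List

# Hypothetical key pages of an online store; adjust to the real project.
CRITICAL_PATHS = ["/", "/catalog/", "/cart/", "/order/"]


def smoke_test(paths: List[str], fetch_status: Callable[[str], int]) -> Dict[str, int]:
    """Request each path and return the ones that did not answer with HTTP 200.

    `fetch_status` takes a path and returns an HTTP status code. It is
    passed in so the check can run against any HTTP client, or a stub.
    """
    failures = {}
    for path in paths:
        status = fetch_status(path)
        if status != 200:
            failures[path] = status
    return failures
```

An empty result means every page answered normally. A fuller smoke test would also walk through the main user scenario (add a product to the cart, place an order), since a page can return 200 and still be broken.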

With frequent and high-stakes deployments, it is better to automate the procedure completely. This is what the continuous integration and continuous delivery approaches were invented for. It is important to automate the routine in order to eliminate the human factor, and to cover critical parts of the project with automated tests. However, this approach makes sense mainly on very large and constantly changing projects, because it carries serious overhead (it is expensive).

Google PageSpeed and Server

For several years now, a site's ranking in Google search has depended, among other things, on the parameters of the site itself: its loading speed, adaptation for mobile phones, total page weight, and the absence of unnecessary content during loading. An image optimization module is installed on our servers automatically, because the main reason a site fails to reach the green zone in Google PageSpeed is too large a volume of images in the transmitted traffic.

In addition to the well-known image formats .png and .jpg, there is a newer format, .webp, which is not supported by every browser (in practice, only older versions of Safari and Internet Explorer do not get along with it). It saves a little space on most images, and on some kinds of graphics (monochrome, with transparent areas) the savings are significant. For example, a .png file with transparency that weighs 200 KB can weigh only 20 KB in .webp.

But since some browsers do not support .webp, you have to be cunning: old-school browsers are served images in the old formats, and those that are up to date are served .webp. The module mentioned above detects which browser it is dealing with and can convert an image on the fly to reduce its weight.
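
One common way to tell the two groups of browsers apart is the `Accept` request header: browsers that can display WebP list `image/webp` in it. Whether the module described here works exactly this way is an assumption; the Python sketch below only illustrates the selection logic, and the function name and file names are made up.

```python
def pick_image_variant(accept_header: str, original_name: str) -> str:
    """Return the file name to serve: the .webp twin if the client accepts it."""
    # Split "image/webp,image/*;q=0.8" into bare media types without parameters.
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "image/webp" in accepted and "." in original_name:
        stem = original_name.rsplit(".", 1)[0]
        return stem + ".webp"
    return original_name
```

For a request with `Accept: image/avif,image/webp,image/*,*/*;q=0.8`, the function picks `logo.webp` instead of `logo.png`; a browser that does not mention `image/webp` gets the original file.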

If we use Shared hosting and the optimization module cannot be installed there, PageSpeed may give the site 30 points simply because the images on it weigh 5 MB. But if you use a VPS and configure image optimization on it, you can easily end up in the green zone (from 80 points).

The second most common PageSpeed problem, after a large volume of images, is JavaScript that is too heavy. Programmers should think about optimizing scripts from the layout stage onward. If a bunch of analytics scripts, counters, extra chats, or call trackers are installed on the site, you will not get into the green zone, because all these scripts take milliseconds, seconds, or even tens of seconds of CPU time when the site opens. Google doesn't like that very much.

In our practice, it happens very regularly that third-party scripts, chats, analytics, and so on, added while the site is in operation, take it out of the green zone and into the red. What to do has to be decided case by case, based on how useful each script really is. If the customer can be convinced that some of these "features" are not needed and only do harm, fine. If not, you have to defer the loading of heavy scripts manually, so that they start loading 1-3 seconds after the site has fully loaded. This is, in effect, a "deception". With analytics counters, this trick does not work.

Yes, sometimes you can detect that a request came from the Google robot and simply not output all these scripts to it. That way we deceive PageSpeed, and its scores can land in the green zone. Karmically, this is bad for the site: it can fall under a search filter. And the site will not get any faster for the user. The problem needs to be solved at the root, not patched over.

Server backup

Usually backups are made for three reasons:

protection against unintentional damage — for example, when someone accidentally deleted an information block or an order;
protection against intentional damage — when a website has been hacked and something has been intentionally deleted from it;
protection against equipment failure — computers are still physical objects, and sometimes their hard drives with all the information die.

One of the advantages of Dedicated servers over VPS is that the former usually have not one hard disk but two, each a complete copy of the other (a RAID mirror). If one of them suddenly fails, the data can be restored from the second.

If we are talking about protection against intentional and unintentional damage, backups can be made on the same server: once a day the server copies and archives the database and the settings of all installed programs, and then puts them in a separate folder on that same server. If we are talking about protection against equipment failure (in our experience, about once a year a hard disk dies on one project or another) and the project is fairly large, it makes sense to buy space on a separate server for backups.

Bitrix can create backups automatically. With an active license, it can upload its backups to Bitrix's own cloud. The downside is that, depending on the license, the space in this cloud is limited, so only a few recent backups are stored there.

Most often, we set up backups at the server level: only the server administrator can restore a backup, and the site administrator (the customer) cannot. If the customer wants to be able to restore a backup on their own (for example, because they suspect their content managers are not very experienced and could break everything), we set up automatic backups through Bitrix, and then a backup can be restored in a few clicks.

We usually set backups up so that they are automatically kept on the servers for at least two weeks. Once a week a full backup (the master backup) is made, and on the remaining six days only the files that have changed relative to this master are saved. Thanks to this, it is always possible to roll back to a copy of the site from any day in that interval.
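
The weekly scheme described above can be sketched in Python as follows. The choice of Sunday for the full backup and the function names are assumptions made for the example; the 14-day retention comes from the "at least two weeks" in the text.

```python
from datetime import date, timedelta
from typing import Dict, List

FULL_BACKUP_WEEKDAY = 6  # Sunday (Monday is 0); an assumed choice
RETENTION_DAYS = 14      # "at least two weeks"


def backup_kind(day: date) -> str:
    """'full' on the weekly master day, 'changed-files' on the other six."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "changed-files"


def files_to_save(mtimes: Dict[str, float], master_time: float) -> List[str]:
    """Files modified after the master backup was taken, in sorted order."""
    return sorted(path for path, mtime in mtimes.items() if mtime > master_time)


def expired(backup_days: List[date], today: date) -> List[date]:
    """Backup dates that have fallen out of the retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [d for d in backup_days if d < cutoff]
```

One caveat when deleting old copies: a "changed-files" backup is useless without the master it was made against, so a master should be removed only together with the daily backups that depend on it.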

If there is a lot of space on the server, you can experiment with the backup frequency. If the project is large and uses more than one server (separate servers for the database and for code execution), you can configure them to back each other up, and then no additional space for backups is needed.

The backup includes all of the site code, the core, the database, the cron jobs (periodic tasks), and all the settings of the software installed on the server. To save space, you can make a full backup once a week and on the other days save only the files that have changed.

Servers can use SSD or HDD storage. Bitrix contains a huge number of small files, and an SSD works with them much faster than an HDD. Therefore, if possible, it is better to buy hosting with an SSD.

Domains

To put a project online, it is important not only to configure the server but also to configure the domain so that it points to this server. In the first part, we already drew a picture of how this all works.
