Linx SAFE - Disaster Recovery in AWS

A scalable and cost-effective solution for application and data recovery. Guaranteed recovery point (RPO) within seconds and recovery time (RTO) within minutes in case of an incident.

WE OFFER RECOVERY FROM

Unexpected seizure of your production servers

Data loss as a result of server malfunction or destruction

Loss of access to remote servers

Power outage at your main data center

Service Components

Direct Connect

A reliable dedicated channel protected from IP blocking

DISASTER RECOVERY (DRaaS)

A twin for your physical or virtual servers to minimize losses caused by infrastructure failure in the event of a natural or man-made disaster.

Amazon Web Services

The world's largest public cloud, developed and supported by Amazon to the highest service quality standards.

linx safe benefits

enable data replication for your project with a reliable provider and migrate to aws in a single day

Easily accessible smart console with advanced statistics

Setup and launch of data replication to AWS in one day

Secure dedicated channels to data centers in Frankfurt and Stockholm

No burden of infrastructure deployment or hardware and software maintenance

product map

We combined several services into one product to ensure fault-tolerance of your corporate data and make it accessible 24/7

fast recovery

of applications and data from an instance saved right before the incident or at any other point in time.

single process

for testing, recovery, and failover across a wide range of applications, requiring no niche expertise to manage.

elastic recovery

for more flexibility, as well as the option to add or remove replicated servers as needed.

DR in global clouds: expert opinion

Sooner or later, all hardware fails. Ignoring that fact creates a risk of unforeseen incidents.

To mitigate the consequences of system failures, a company should prepare and follow a disaster recovery (DR) plan.

Evgeny Makaryin, head of the Linxdatacenter project and solution development group, explains what role global cloud providers play in today's DR plans.

Fast start

ensure reliable protection of your data at linxdatacenter in just a few steps

application

We meet you, analyze your infrastructure, and do the paperwork

preparations

We secure your data, prepare your servers for data transfer, and provide the links to connect to the recovery facilities

development

We launch the backup infrastructure for your services in AWS

testing and launch

We test data backup and disaster recovery and launch the system

support and consultation 24/7/365

We provide around-the-clock technical support and professional advice in English and Russian

IAAS PRICING

basic

12.29 ₽/hour

business

94.23 ₽/hour

enterprise

496.41 ₽/hour

individual

A data protection and disaster recovery solution tailored to your particular needs

Licenses and certificates

DR in global clouds: improving disaster recovery reliability  

Evgeny Makaryin, Head of the Project and Solution Development Group, Linxdatacenter

Sooner or later, any equipment fails. Assuming that IT equipment will run for years without issue and that the server room in the office can never go down is dangerous.

To prevent such incidents from coming as a surprise, it is important to build their probability into the disaster recovery (DR) plan. Evgeny Makaryin, head of the Linxdatacenter project and solution development group, explains what role global cloud providers play in today's DR plans.

The axiom of business continuity today: only with a documented and regularly tested DR plan can a company count on predictable recovery times when its infrastructure and IT systems fail.

Creating and implementing a corporate DR plan requires significant resources, both during the development and launch phase and when keeping systems and processes in a state of constant readiness.

If the recovery process is not sufficiently automated, engineers will have to perform too many steps manually during a disaster, and human error under severe stress can lead to long delays and to IT systems and applications misbehaving. Nor should we forget situations such as business inspections by the authorities, the execution of court orders, and so on.

"Exorbitant" level of protection

In recent years, it has become increasingly common to implement a DR plan by leveraging the resources of public clouds, including global providers.

The triumvirate of "big clouds" - Amazon Web Services, Google Cloud Platform, and Microsoft Azure - provides customers with maximum automation at every stage, from installing the DR solution to ongoing management, monitoring, and incident response.

Hosting in public clouds is becoming increasingly cost-effective for companies even for routine workloads, and in the event of a major disaster, public clouds currently have no competitors for the fastest possible recovery.

Why can't we just do disaster recovery in Russia?

First of all, global cloud providers have effectively unlimited computing power and disk space, allowing them to host IT infrastructure for customers of virtually any size.

There is no need to pay the provider to reserve recovery capacity specifically for your company. With a global cloud as the DR site, the business can be confident everything will work, without overpaying for resources it does not use.

In Russia, providers of a similar scale are only beginning to emerge, and they do not yet offer a comparable combination of capacity, experience, breadth of available services, and quality of service.

Why would you want this

Suppose your IT infrastructure is in a data center that suddenly catches fire. The scenario is real and very frightening: everyone remembers the OVH fire in France, when even companies that stored backups in remote data centers spent hours and days restoring from backups and bringing their infrastructure back up.

This happens because of manual procedures and a lack of regular testing. The team has an idea of what needs to be done, but if the system is not tested regularly and drills are not conducted, recovery often does not go according to plan.

Another scenario: your clients consume services from abroad, while your infrastructure sits in the cloud of a local provider in Russia. Because of potential political conflicts, access from abroad to sites in Russia (and vice versa) may be blocked for an indefinite period.

If backups are also stored abroad, manual recovery will take hours or even days. If there are no backups abroad and all data is stored only in Russia, you need to urgently contract with a foreign hosting provider, somehow transfer the data there, bring up the infrastructure manually, and only then fully restore from backups. Downtime in this scenario is measured in weeks.

DR in a global cloud allows you to have a fully working infrastructure, on average, within 20 minutes of initiating the disaster recovery plan.

Preparing

Migration to a global cloud is often portrayed as incredibly complex and costly. In reality, providers offer tools that make migration painless for DR projects.

As a rule, such solutions consist of several main building blocks: a console to manage the entire process, and agents (special software) installed on the customer's source Windows or Linux servers. You also need an account in the public cloud, where a so-called staging area with a replication server is deployed automatically. The data replicas are stored there as well.

Alongside it, a target zone is prepared where the client's virtual machines will start during a test or a live ("combat") activation of the DR plan.

This setup allows the migration process to be almost fully automated for dozens or even hundreds of servers.
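
As a concrete illustration, here is a minimal sketch of inspecting such replication through an API, using Python with boto3 against AWS Elastic Disaster Recovery (the `drs` client). This is an assumption for illustration only: the solution described above is managed through the provider's console, and the exact service, region, and field names may differ.

```python
# Sketch: list source servers known to AWS Elastic Disaster Recovery and
# print their replication status. Assumes the boto3 "drs" client; field
# names follow that API as we understand it.
import boto3

drs = boto3.client("drs", region_name="eu-central-1")  # Frankfurt, as an example

resp = drs.describe_source_servers(filters={})
for server in resp.get("items", []):
    hints = server.get("sourceProperties", {}).get("identificationHints", {})
    repl = server.get("dataReplicationInfo", {})
    print(
        server.get("sourceServerID"),
        hints.get("hostname"),
        repl.get("dataReplicationState"),  # e.g. CONTINUOUS once fully in sync
        repl.get("lagDuration"),           # replication lag, i.e. the effective RPO
    )
```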

Ready, set, DR!

After the initial transfer of server data to the global cloud, continuous block-level replication is enabled. This makes an RPO (recovery point objective, the point in time to which a failed IT system is restored) of a few seconds achievable.

Beyond data replication, it is important to remember that during a disaster a lot of time is spent setting up networking and security in the target zone, so this work must be done in advance.

In a global cloud, you create security groups and virtual networks, preferably with internal addressing that mirrors the original infrastructure. Then, in the console, you can granularly configure, for each individual server, the parameters with which it will start when the DR plan is activated.

For example, you can set the machine type (how many cores and how much memory are assigned), the billing method, subnet, IP address, security groups, and so on. For many settings, parameters matching the source server can be selected automatically.
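
As a sketch of what those per-server start parameters look like in AWS terms, the snippet below creates an EC2 launch template pinning the instance type, subnet, private IP, and security groups. All names and IDs are hypothetical placeholders, not values from the product described above.

```python
# Sketch: an EC2 launch template capturing per-server start parameters
# (machine size, subnet, fixed private IP, security groups).
# All IDs are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")

ec2.create_launch_template(
    LaunchTemplateName="dr-web-01",           # one template per protected server
    LaunchTemplateData={
        "InstanceType": "t3.large",           # sized to match the source server
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "SubnetId": "subnet-0123456789abcdef0",
            "PrivateIpAddress": "10.0.1.15",  # mirrors the source's internal address
            "Groups": ["sg-0123456789abcdef0"],
        }],
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "dr-plan", "Value": "linx-safe"}],
        }],
    },
)
```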

In addition, servers can be grouped, and the order in which those groups start can be configured as part of the disaster recovery plan. All of these operations are automated in global clouds and require minimal manual intervention during an incident or a test.
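
A hedged sketch of such ordered startup: launch one recovery group, wait for its recovery job to finish, then start the next tier. The `start_recovery` call and its `isDrill` flag follow the boto3 `drs` API as we understand it; the grouping itself and all server IDs are illustrative assumptions.

```python
# Sketch: start recovery groups in a fixed order (databases, then app
# servers, then frontends). Server IDs are hypothetical placeholders.
import time
import boto3

drs = boto3.client("drs", region_name="eu-central-1")

RECOVERY_GROUPS = [                                # started strictly in this order
    ["s-1111111111111111"],                        # databases
    ["s-2222222222222222", "s-3333333333333333"],  # application servers
    ["s-4444444444444444"],                        # frontends
]

def wait_for_job(job_id):
    # Poll the recovery job until it completes; status values per the
    # boto3 drs API as we understand it.
    while True:
        job = drs.describe_jobs(filters={"jobIDs": [job_id]})["items"][0]
        if job.get("status") == "COMPLETED":
            return
        time.sleep(30)

def run_plan(is_drill=True):
    for group in RECOVERY_GROUPS:
        job = drs.start_recovery(
            sourceServers=[{"sourceServerID": sid} for sid in group],
            isDrill=is_drill,          # True = a test launch in the target zone
        )["job"]
        wait_for_job(job["jobID"])     # wait before starting the next tier

run_plan(is_drill=True)                # regular drills, as the article recommends
```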

Access issues

To access the DR capabilities of big clouds from Russia, you need to look for certified partners of global providers.

What are the advantages?

First, such a partner will audit the infrastructure: assess which components of the client's IT estate require protection (hardware, virtualization, operating systems, applications), determine how many resources are allocated to them and how many are actually used, and evaluate how connectivity is organized and how bandwidth is utilized, which is also very important.

Second, the partner develops a disaster recovery plan that describes in detail which threats the company is protected from and how, who is responsible for what, whom to contact (and how) during testing or a live failover, and the procedure for keeping the DR plan up to date.

Third, the partner builds the client's infrastructure in the global cloud, preparing the required accounts, networks, access lists, security groups, routes, gateways, load balancers, and so on.
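
A minimal sketch of that network preparation in Python with boto3, under the article's assumption that the target addressing mirrors the source infrastructure. CIDRs, ports, and names are hypothetical placeholders.

```python
# Sketch: pre-building the target zone's network so nothing has to be
# configured by hand during a disaster. All values are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")

# VPC and subnet whose addressing mirrors the source infrastructure
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
subnet_id = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")["Subnet"]["SubnetId"]

# Security group reproducing the ingress rules the source servers rely on
sg_id = ec2.create_security_group(
    GroupName="dr-app-tier",
    Description="Pre-provisioned rules for the DR target zone",
    VpcId=vpc_id,
)["GroupId"]

ec2.authorize_security_group_ingress(
    GroupId=sg_id,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS from clients"}],
    }],
)

# Internet gateway and attachment so recovered frontends are reachable
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

print("target zone ready:", vpc_id, subnet_id, sg_id, igw_id)
```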

The partner also generates the agents to be installed on the source servers and sends download links and installation instructions. Once the client installs the agents and opens the required ports, the servers appear in the control panel and replication begins. A template is then configured in the console so that, during recovery to the global cloud, the virtual machines start with the right settings and in the right order.

A DR plan cannot be considered valid until it has been tested. The partner will conduct tests, both as an initial pilot and as regular scheduled exercises.

Finally, the partner takes on managing the DR infrastructure and keeping the DR plan up to date. This is a very important and quite time-consuming part of any strategy for keeping the business up and running.

At Linxdatacenter, we were the first in Russia to develop a turnkey solution for backup and disaster recovery of client infrastructure. It establishes a reliable, secure connection between the client and the global cloud service in the shortest possible time and, when necessary, delivers a fully working infrastructure within an average of 20 minutes of initiating the disaster recovery plan.

Write to us

How we optimized customer data center management

A data center is a complex IT and engineering facility that demands professionalism at every level of management, from managers to technical specialists and maintenance staff. Here is how we helped our client put operational management of its corporate data centers in order.
 

Taras Chirkov, Head of Data Center in St. Petersburg

Konstantin Nagorny, Chief Engineer of the Data Center in St. Petersburg


Management is in the lead 

Even the most advanced and expensive IT equipment will not deliver the expected economic benefits unless proper operational processes are established for the engineering systems of the data center that houses it.

The role of reliable, high-performing data centers in today's economy keeps growing, along with the requirements for their uninterrupted operation. However, there is a big systemic problem on this front.

A high level of "uptime" - failure-free operation of a data center without downtime - depends heavily on the engineering team that manages the site. And there is no single, formalized school of data center management.

Nationwide  

In practice, the situation with the operation of data centers in Russia is as follows.  

Commercial data centers usually hold certificates confirming their management competence. Not all of them do, but the very nature of the business model, in which the provider answers to the client for service quality with money and market reputation, obliges them to master the subject.

The segment of corporate data centers serving companies' own needs lags far behind commercial data centers in operational quality: the internal customer is not treated as carefully as an external one, and not every company understands the potential of well-tuned management processes.

Finally, there are government departmental data centers, which are often unknown territory in this regard due to their closed nature. An international audit of such facilities is understandably impossible, and Russian state standards are only now being developed.

All this translates into an "anything goes" situation: operations teams of mixed composition, staffed by specialists with different backgrounds, different approaches to organizing corporate architecture, and different views on and requirements for IT departments.

Many factors lead to this state of affairs; one of the most important is the lack of systematic documentation of operational processes. A couple of introductory articles from the Uptime Institute give an idea of the problem and how to overcome it, but beyond that the system has to be built through your own efforts, and not every business has the resources and competence for that.

Meanwhile, even modest systematization of management processes according to industry best practices reliably yields excellent results in improving the resilience of engineering and IT systems.

Case study: through thorns to relative order

Let us illustrate with a completed project. A large international company with its own network of data centers approached us, asking for help optimizing management processes at three sites where IT systems and business-critical applications are deployed.

The company had recently undergone an audit by its headquarters and received a list of deviations from corporate standards, with orders to eliminate them. We were brought in as a consultant and a bearer of industry competence: for several years we have been developing our own data center management system and promoting the role of quality in operational processes.

Communication with the client's team began. The specialists wanted a well-established system for operating the data centers' engineering systems, with documented processes for monitoring, maintenance, and troubleshooting. All of this was to optimize the infrastructure component of IT equipment continuity.

And here began the most interesting part.  

Know thyself 

To assess data centers' compliance with standards, you need to know the business's exact requirements for IT systems: the internal SLA level, the allowable equipment downtime, and so on.

It became clear right away that the IT department did not know exactly what the business wanted. There were no internal service quality criteria and no understanding of the logic of its own infrastructure.

Colleagues simply had no idea what the permissible downtime for IT-related operations was, what the optimal system recovery time after a disaster was, or how the architecture of their own applications was structured. For example, we had to determine whether the failure of one of the data centers would be critical for an application, or whether no components affecting that application were hosted there.

Without knowing such things, it is impossible to calculate specific operational requirements. The client recognized the problem and improved coordination between IT and the business to develop internal requirements and build the relationships needed to align operations.

Once an understanding of the IT systems architecture was achieved, the team was able to summarize the requirements for operations, contractors, and equipment reliability levels.  

Improvements in the process 

Our specialists traveled to the sites to assess the infrastructure, read the existing documentation, and check how closely the as-built data centers matched their designs.

Interviews with the responsible employees and their managers became a separate area of focus. They described what they do and how in various work situations, and how the key processes of operating the engineering systems are arranged.

Once the work started and the specifics of the task became clear, the client backpedaled a little: we heard requests to "just write all the necessary documentation," quickly and without diving deep into the processes.

However, properly optimizing the management of a data center's engineering systems means teaching people to assess the processes themselves and to write documentation unique to those processes, based on the specifics of the site.

It is impossible to write a working document for a specific maintenance area manager unless you work with them on site continuously for several months. So that approach was rejected: we found local leaders who were willing to learn themselves and to lead their subordinates.

Having explained how documents are created, what their contents must include, and how the ecosystem of instructions is organized, we spent the next six months overseeing the detailed writing of the documentation and the staff's step-by-step transition to the new way of working.

This was followed by a year of initial support, in a remote format, for working under the updated regulations. Then we moved on to training and drills - the only way to put the new material into practice.

What's been done 

In the process, we were able to resolve several serious issues.  

First of all, we avoided duplicate documentation, which the client's employees feared. To that end, the new regulations combined the standard regulatory requirements for the various engineering systems (electrical, cooling, access control) with industry best practices, creating a transparent documentation structure with simple, logical navigation.

The principle of "easy to find, easy to understand, easy to remember" was reinforced by linking the new information to the employees' existing experience and knowledge.

Next, we reshuffled the service engineering staff: several people turned out to be completely unprepared for the change. Some resistance was successfully overcome during the project by demonstrating the benefits, but a certain share of employees proved untrained and unreceptive to new things.

We were surprised, though, by the company's careless attitude toward its IT infrastructure, from the lack of redundancy for critical systems to the chaos in its structure and management.

Within a year and a half, the engineering systems management processes were brought up to a level that allowed the company's specialists to report successfully on quality to the auditors from headquarters.

If the company keeps developing its operations at this pace, it will be able to pass any existing data center certification from the leading international agencies.

Summary 

In general, we believe the prospects for consulting in data center operations management are excellent.

Digitalization of the economy and the public sector is in full swing. Yes, there will be many adjustments to the launch of new projects and to plans for developing old ones, but this does not change the essence: operations must improve, if only to raise the efficiency of sites that have already been built.

The main problem is that many managers do not realize what thin ice they are walking on when they neglect this point. The human factor remains the main source of the most unpleasant accidents and failures, and this needs to be explained.

Government data center projects are also becoming more relevant and require increased attention to operations: the scope of government IT systems keeps growing. Here, too, a system for standardizing and certifying sites will need to be developed and introduced.

Once the legislative requirements for public data centers in Russia are consolidated into a standard, it can also be applied to commercial data centers, including those hosting public IT resources.

Work in this area is ongoing; we are taking part by consulting with the Ministry of Digital Development and by building the competencies to teach data center operations courses at ANO Data Center. There is little experience with such tasks in Russia, and we believe it should be shared with colleagues and clients.

Disaster recovery in AWS

BEST, money transfer and payments operator

business challenge

The customer faced a technical issue: its BGP session with Linxdatacenter hardware kept flapping. We examined the problem and found that one of the customer's hosts was under a DDoS attack.

Because of the distributed nature of the attack, traffic could not be filtered effectively, and disconnecting the host from the external network was not an option. The attack stopped after changes to the server configuration but resumed the next day. The 5.5 Gbps attack overloaded the junctions with internet providers, affecting other Linx Cloud users. To mitigate its effects, we employed a dedicated DDoS protection service.

Solution

To ensure the continuous availability of resources hosted in Linx Cloud, we rerouted all the customer's traffic through the StormWall anti-DDoS system. The attack was stopped within half an hour. To prevent future attacks, we routed all connections to the customer's resources through the StormWall network.

