LeaseWeb Blog
http://blog.leaseweb.com
LeaseWeb is a quality hosting provider and operator of a first-class worldwide network.
Fri, 01 Jul 2016 13:36:37 +0000

Protecting patient data with LeaseWeb
http://blog.leaseweb.com/2016/06/29/protecting-patient-data/
Wed, 29 Jun 2016 13:56:23 +0000

The post Protecting patient data with LeaseWeb appeared first on LeaseWeb Blog.

In a study released this week by the Ponemon Institute, a U.S. privacy research group, almost 90 percent of surveyed healthcare organizations reported they had at least one data breach involving patient data in the last two years; 45 percent reported more than five breaches.

Healthcare records are a prime target for hackers because they are such a rich source of information. Stolen credit card numbers expire quickly once the patterns of misuse are discovered. Personal identity information is far more persistent.

In the U.S., healthcare organizations and their business associates are governed by the Health Insurance Portability and Accountability Act, commonly known as HIPAA. This law sets out specific requirements for how patient data must be protected, stored and used.

HIPAA Ready

LeaseWeb’s data center in Manassas, VA – in the DC capital area – was recently recognized as being HIPAA-ready by independent auditor EY. EY noted that LeaseWeb USA’s HIPAA-compliant hosting environment meets all the applicable standards for logical and physical security, operational resilience, incident management, service deployment and change management.

This third-party statement of recognition allows customers in the United States to make the LeaseWeb platform part of their overall HIPAA compliance process, while also providing international customers with the assurance that their data will be well protected.

As the future of healthcare will be very technology-driven, protecting patient data becomes even more important. Medical information doesn’t just live in doctors’ offices anymore. Long-distance “telehealth” allows doctors in urban centers to treat patients far from their facilities. Connected apps and devices – part of the Internet of Things – are monitoring everything from glucose levels to heart rhythms. Medical data protection is no longer the exclusive domain of pure healthcare professionals.

Once these innovators get to a certain scale, they face threats beyond just prying hackers looking for data. As the Ponemon study notes, “ransomware, malware, and denial-of-service (DOS) attacks” were cited by healthcare organizations as their top cyber threats.

For an Internet-based business, uptime is money. That’s why LeaseWeb offers a built-in security service that allows cloud-based businesses to easily respond to threats, mitigate attacks (like DDoS) and monitor suspicious traffic to prevent data breaches. A configurable, cloud-based dashboard lets customers adjust their security settings, monitor suspicious traffic and respond to threats. LeaseWeb Application Security can even be used as a standalone by non-LeaseWeb customers!

If you would like to learn more about how LeaseWeb can help meet your Health IT needs, please contact us today.

Scaling Our Engineering Department (part 2)
http://blog.leaseweb.com/2016/06/28/scaling-engineering-department-part-2/
Tue, 28 Jun 2016 09:52:45 +0000

The post Scaling Our Engineering Department (part 2) appeared first on LeaseWeb Blog.

Techsummit by leaseweb in Berlin, 13.4.2016 at Kulturbrauerei. Copyright Raum11/Jan Zappner

This is the second part in a 2-part series. Read part 1.

In my last post I gave an overview of how we started the process of changing the way we do agile development at LeaseWeb by setting goals, engaging our engineering teams, and developing a maturity model to get everyone working to the same standard. In this post I’m going to talk about how we set up our scrum teams, how we get all levels of the business involved in the development process, and how we calculate the cost to provide the most value for the company.

Our current Scrum teams are set up as follows:

  • Product Owner – defines the priorities of the team; responsible for the order in which features are built.
  • Scrum Master – in charge of the scrum process (coach): making sure the team does retrospectives, sprint planning, refinement, coordinating meetings.
  • DevOps – Development and Operations in one team of about 5-8 people.

The first step to creating the teams was to train the scrum masters and product owners. We did this with the help of a company called Prowareness who helped us to develop solid agile principles. Once we had those in place we went on to train the stakeholders outside the engineering department on what scrum is and why we are using it and what parts are needed by different groups to participate. This included not only departments such as sales and support but also the board of directors. This was important to get everyone on the same page and using the same processes. We started 2016 with the sprint counters reset to zero so we could begin our first two-week sprint at 01.

Predictable Development
Our next challenge was prioritizing based on the company’s needs. Instead of just talking about ideas we wanted to write them down so that we could compare and calculate them. To do this we introduced business cases which are ideas from a department that needs a product or service. The department writes the business case itself which includes the value to the company while the cost is calculated by engineering.
Once there is a business case the next step is to develop a roadmap. We do this once every sprint. To accomplish this we set up a meeting to do the following:

  • Invite all of the stakeholders to collaborate and discuss the business case and get different perspectives.
  • Get all of the ideas on sticky notes to evaluate and group.
  • Define a Minimal Viable Product; strip away excess features until you have the core.
  • Find dependencies on other teams; decide what features require support from which team to gather resources.
  • Find the priority of the (refined) business case; this is determined by the value vs. the cost. Business cases are ranked by which gives the most value to the company.

After the stakeholders have shared their ideas, a minimal viable product has been defined along with team dependencies, and the project priority has been determined, the next step is refinement. The only team involved in this part of the process is the development team, which meets at least twice a week for an hour. They take the ideas from the sticky notes and turn them into user stories. These user stories focus on the technical details and each one is assigned story points. The number of points is based solely on the complexity of the story and does not specify a time as to when the work will be completed. Complexity is not only determined by the service or product request but also the definition of done. The story isn’t done until the definition is met. Once complexity has been determined the team is asked to assign a number based on that complexity and then the highest and lowest are used to find a median on which to base the user story.

Now that the complexity is known the team can move on to determine the definition of ready. We do this with our INVEST checklist so that we ensure the user story is:

  • Independent – can it be built without other dependencies? It should be small enough to be delivered on its own.
  • Negotiable – can you still discuss with engineers what it should be and how it should look?
  • Valuable – if it isn’t valuable it shouldn’t be built.
  • Estimable – can you estimate how long it will take to build?
  • Small – is it small enough to fit into one sprint? Don’t work on a user story that doesn’t fit into one sprint. If it doesn’t fit break it down into smaller pieces.
  • Testable – at the end of the story can something testable be delivered?

Priorities Based on Value
With our refined user story done the next thing we want to figure out is our team velocity. Team velocity is the number of story points that can be built during a normal sprint. Once a team has worked a few sprints we can determine on average how many story points can be burned by a team during a sprint. With this information we can calculate how long it will take a team to do a business case. From there we can figure out the cost of the business case with the following equation: story points of business case / team velocity = # of sprints. The cost of development is the number of sprints multiplied by the number of engineers. Using these data we take the estimated value of the business case and subtract the development cost to determine the priority.
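The calculation above can be sketched in a few lines of Python. This is only an illustration, not LeaseWeb's actual tooling: the names `BusinessCase`, `sprints_needed`, `development_cost`, `priority_score` and `rank` are made up for this example, and two assumptions are added. The number of sprints is rounded up to whole sprints, and a hypothetical `cost_per_engineer_sprint` converts "sprints times engineers" into the same unit as the estimated value so that the two can be subtracted.

```python
import math
from dataclasses import dataclass


@dataclass
class BusinessCase:
    name: str
    story_points: int        # total story points after refinement
    estimated_value: float   # value estimated by the requesting department


def sprints_needed(story_points, team_velocity):
    # story points of business case / team velocity = number of sprints.
    # Rounding up to whole sprints is an assumption of this sketch.
    return math.ceil(story_points / team_velocity)


def development_cost(case, team_velocity, engineers, cost_per_engineer_sprint):
    # Cost of development = number of sprints * number of engineers,
    # converted to money with a hypothetical per-engineer, per-sprint rate.
    sprints = sprints_needed(case.story_points, team_velocity)
    return sprints * engineers * cost_per_engineer_sprint


def priority_score(case, team_velocity, engineers, cost_per_engineer_sprint):
    # Estimated value minus development cost.
    cost = development_cost(case, team_velocity, engineers, cost_per_engineer_sprint)
    return case.estimated_value - cost


def rank(cases, team_velocity, engineers, cost_per_engineer_sprint):
    # Highest score first.
    return sorted(
        cases,
        key=lambda c: priority_score(c, team_velocity, engineers, cost_per_engineer_sprint),
        reverse=True,
    )
```

With a velocity of 40 points per sprint, six engineers and an illustrative rate of 5,000 per engineer per sprint, a 120-point case valued at 100,000 scores 10,000, while a 40-point case valued at 60,000 scores 30,000 and would be ranked first.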

Another key goal was to have a responsive organization based on market needs. To achieve this we have product managers who handle all of the non-development tasks. They work to identify trends in the market and write business cases to respond to those trends as well as develop strategies, procure new hardware, and determine profit and loss.

The last step before development starts is to present the project to our product steering committee which includes our board of directors. The product owners present their roadmap for what will be done over the next four sprints and commit to that deadline. The committee can veto but this requires a good argument for why they don’t want to go forward. Once approval has been given the teams start development. If there are no new business cases to present, then this meeting is used as an update for the ongoing business cases.

Transparency
All of the previous goals help us achieve our final goal of transparency. Looking at the business cases per team we can determine which teams deliver the most value and we can see how we need to scale engineering. If there are no business cases for a team to work on then resources may need to be deployed differently. If a team is overloaded with work then it may need to be scaled up.

At the end of each sprint the teams demo their newly released products at an event for the whole company so that we can celebrate our successes. This is also the official handover from development to the team that requested the product or feature so they can then engage with customers.

We also do a retrospective at the end of each sprint to discuss a variety of issues. What went well or didn’t go well? What should change in the next sprint and did we over- or under-commit? Was there any friction within the team? This is the time for everyone to speak up. We also send out an anonymous happiness measurement to each team member before the retrospective and ask them to let us know which team they’re from and tell us how they’re feeling, how they like working for the team and whether there are issues. This measurement provides us with an overview to determine trouble spots on teams and address any problems to help teams evolve.

Validate the Feature/Product
After all of the hard work is done and the product has been turned over, how do we determine its actual value to the company? During the development process we add measurements such as counters or logging in order to determine how often the new feature or product is used and how much time or money was saved. Based on these results we can decide whether the user story should be developed further or if we should go in a different direction.
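As a rough illustration of the kind of measurement described above, the sketch below keeps a usage counter and a running total of time saved per feature. The class name `FeatureUsage`, its methods and the feature name in the usage note are hypothetical; a real setup would more likely send these numbers to an existing logging or metrics system.

```python
from collections import Counter


class FeatureUsage:
    """In-memory counters for how often a feature is used (illustrative only)."""

    def __init__(self):
        self._uses = Counter()
        self._seconds_saved = Counter()

    def record(self, feature, seconds_saved=0.0):
        # Call this from the code path of the new feature.
        self._uses[feature] += 1
        self._seconds_saved[feature] += seconds_saved

    def report(self, feature):
        # Summary used to decide whether the feature deserves further work.
        return {
            "uses": self._uses[feature],
            "seconds_saved": self._seconds_saved[feature],
        }
```

For example, calling `record("self-service-reinstall", seconds_saved=600)` each time a customer uses a hypothetical self-service feature gives a simple count and a time-saved total to compare against the value estimated in the business case.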

Although we’ve reached our goals to become more agile, we still have a lot of work to do and our process is maturing as we go. We’ve found with this model that not only do teams help each other along and benefit each other with their achievements but that they also have fun reaching a new level of success which in turn helps the entire company grow.

Looking back at TechSummit Amsterdam 2016
http://blog.leaseweb.com/2016/06/21/amsterdam-techsummit-2016/
Tue, 21 Jun 2016 08:30:25 +0000

The post Looking back at TechSummit Amsterdam 2016 appeared first on LeaseWeb Blog.

LeaseWeb’s annual Amsterdam TechSummit took place on June 2 at the Pakhuis de Zwijger, an old warehouse converted to a high-tech multimedia event center. The summit was sold out with over 315 attendees who came to hear a variety of presentations from professionals focusing on this year’s theme: Designing for Scalability.

Those who attended were a diverse assortment of software developers, operations engineers, and managers from companies both large and small. Many of the attendees were local but a good percentage of them had traveled from other countries including Germany, Spain, and even as far as Liberia. All of them were looking to learn about ways to help them grow not only from a technology perspective but how to scale up their engineering teams and how to anticipate and deal with the issues that result from that growth. The summit also provided a good opportunity to network with peers and learn about the challenges they face and what they’ve learned from past mistakes.

The TechSummit opened with a presentation from LeaseWeb’s Head of Product Engineering, Joshua Hoffman, who gave a talk entitled the ‘Seven Deadly Sins of Web Scale’ which drew on his past experience with scaling and the lessons he learned. Many attendees found this talk useful because it provided them with good examples on how to recognize certain ways of approaching problems that can cause issues when growth is needed in the future.

Another presentation that proved to be a highlight for those in attendance was that given by Jorge Salamero Sanz of Server Density called War Games. Jorge talked about the human element of web scale and how properly training engineering teams to react in an efficient manner when outages happen can impact growth. Many who watched said they were looking forward to trying out his recommendations in their own workplaces. These included implementing a checklist for when things go wrong and having developers purposely break things in order to learn how to troubleshoot more efficiently and to be prepared for when people make mistakes.

Reliability was a topic covered by Adam Surák of Algolia who spoke about who is responsible for availability, the importance of monitoring with both internal and external tools to ensure everything is working as expected for both provider and customer, and being aware of dependencies out of your control such as power and network in a data center. He also emphasized the importance of spreading out resources and using multiple vendors in order to mitigate outside issues that can cause downtime.

Recent technologies that are helping companies to scale, such as Linux containers, were discussed in talks by Terrence Ryan of Google and Aanand Prasad of Docker. Aanand spoke about using Docker tools to help engineers streamline their environments in order to narrow the gap between production and development. Terrence covered Kubernetes, a container management system that can help those who have made the switch to containers manage them easily and redundantly at large scale.

Attendees looking to find ways to keep their environments secure attended presentations by Daan Keuper of Pine Digital Security and Jessy Irwin of 1Password. Daan talked about how engineers can learn to think like hackers in order to anticipate and prevent breaches of their infrastructure and covered some of the different techniques hackers use to break in. Jessy spoke not only on general good security practices such as strong passwords and encryption but also a variety of tools developers can use to safely communicate and collaborate while writing software to prevent their code and data from being accessed by third parties.

Whether attending to learn about new tools and approaches to building and managing technology at scale, acquiring information on how to maintain a more secure environment, gathering ideas for future growth, or simply looking for confirmation that they are on the right path, LeaseWeb’s TechSummit 2016 Amsterdam provided a venue for technology professionals to come together and share knowledge.

Want to watch any of the talks? Check them out on YouTube!

Diving deeper into Kubernetes with Google’s Terrence Ryan
http://blog.leaseweb.com/2016/06/14/interview-googles-terrence-ryan-amsterdam-techsummit-2016/
Tue, 14 Jun 2016 09:50:10 +0000

The post Diving deeper into Kubernetes with Google’s Terrence Ryan appeared first on LeaseWeb Blog.

Terrence Ryan, a developer advocate at Google, gave a talk entitled Containing Chaos With Kubernetes at LeaseWeb’s TechSummit in Amsterdam on June 2nd. We sat down to find out a little bit more about his thoughts on the topic.

Interviewer: What issues are facing engineering departments who have just moved to containers?
Terrence: One of the large issues I’ve seen is how you manage and keep track of them all. Containers are ephemeral, so there is the switching over to the dev practices that support that.

That means having applications and architecture that are fault tolerant, in the sense that these containers go away and that should be OK because the data is stored persistently somewhere else. All the app is doing is computing stuff and sending it back to the users. One of the big challenges we’ve seen and one that Kubernetes tends to solve is, “I have all of these containers, how do I keep track of them?” Those are the two problems we see come up. Kubernetes solves the management of the containers.

I: What are some things people are trying that don’t work and how does Kubernetes address these issues in terms of managing containers?
T: Whenever you move to a new medium – there’s a lot of examples of this in mass media.
Radio was big and then radio switched to television and the same people who were working in radio were now working in television. So then you had tv shows that were just like radio shows but we just added the one camera visual of the performers. Same thing with movies.

I: So, trying to implement old technology in new mindset or framework?
T: Yes. That is the problem; someone is trying to run on Kubernetes exactly like they ran VMs. You can do it but you’re not getting the big benefits. If you’re just doing a radio show you don’t get the benefit of a closeup. So in this case if you’re running a big monolithic application in Kubernetes you’re not getting the advantages of microservices, which are very fast deployments, being able to change things without affecting the entire system, all those sorts of things. That, I think, is the big problem and that’s just technological growing pains.

I: How long has Kubernetes been available and have you received feedback from users who have implemented it in production?
T: Kubernetes as a project has been publicly available for two years. We’re still very much in the outreach stage but many users have said that it delivers as promised. I think it’s gotten very exciting in the last nine months or so. A lot of the feature sets that people have been asking for have been implemented and it’s really taken off.

I: Going forward, do you think container management systems and containers themselves are going to be a more common consideration when building out infrastructure?
T: I am of the firm belief that containers will be where VMs were five years ago. Everyone is doing them, it’s just the way to solve most sets of problems. Because that’s the only way you’re able to deal with that much scale is when you can eke out every single cycle from every single processor.

I: Thanks for sitting down to talk with us, Terrence.
T: Thank you.

Want to see the other talks at the TechSummit? You can check them out here.

Scaling our engineering departments (part 1)
http://blog.leaseweb.com/2016/06/13/scaling-engineering-departments-part-1/
Mon, 13 Jun 2016 12:02:11 +0000

The post Scaling our engineering departments (part 1) appeared first on LeaseWeb Blog.

In this blog post I’m going to cover how we are scaling our engineering department at LeaseWeb: where we started, the lessons we’ve learned, and how we are hoping to move forward. LeaseWeb was founded in 1997 and we currently have over 350 employees throughout the world with the majority working at our headquarters in Amsterdam. The engineering department currently has about 100 employees and manages 65,000 servers in seven locations (with more to be added soon).

In the past development was based on shifting priorities rather than planned business cases. Developers weren’t able to concentrate their efforts on thoroughly building, testing, documenting, and presenting a demo for one project before another more urgent one was moved to the head of the line. Operations and Development were working on different schedules and weren’t able to meet each other’s requirements to their mutual satisfaction. We wanted to change this.

Set goals
The development department had already been doing some aspects of agile in the form of scrum but the problem was there was no one coherent scrum in the company. Everyone had started at different times with scrum and teams were working on different sprint numbers so that no one knew when a feature would actually be done. In order to address this confusing situation and to become more agile and more productive we decided to set some goals:

  • Fully autonomous teams that are empowered
  • Predictable development
  • Priorities based on value for the company
  • Transparency on development/roadmap
  • Responsive organisation based on market needs.

Once we had set our goals we had to convince the company that this would be the best way to move forward. We had two competing mindsets: ‘we need to be ready’ vs ‘let’s get started’. There was friction between the two and initially we started to plan and prepare but halfway through an executive decision was made to move forward with the change.

Engage and empower teams
The first thing we wanted to do was engage the engineering teams. We came up with a simple way to boost creativity by organizing a quarterly hackathon. For two full days every quarter engineers have the opportunity to build whatever they want; this can be something personal or work related, and there are no restrictions. The event starts off Thursday morning with a kickoff where everyone is invited and everyone gets a t-shirt. People then go off and work on their projects until the evening, when we take a break and do something fun like laser gaming or trampoline dodge ball. At the end of the two days the engineers are able to demo something they are proud of and a prize is given to the best developer.

What we noticed coming out of these sessions was that engineers were able to work on projects that weren’t getting management or department support until they had the chance to do a demo and show exactly what they had in mind. Many times the engineers were given the green light to finish the project.

The next thing we wanted to do was to empower the teams. We did this by making the entire team responsible for their own product from development to production. We also set up an on-call rotation within each team. The result of this was to make the engineers care about the product and want to fix it so they aren’t woken up in the middle of the night. Also, we wanted to ‘eat our own dogfood’ where each engineer could choose a product whether it be a virtual server or an actual piece of hardware so that they can play with it and give feedback.

Streamline goals and define done
We also developed a maturity model for teams so that we could get everyone to a standard level and have everyone working the same way. Sander Poelwijk goes into detail about this model in his blog post.

In order to get teams to be responsible for their projects you need to have a definition of done. This is a document created by each team and is a checklist you go through so that nothing is considered done until all the boxes have been checked. This could include logging, monitoring, does it have documentation, a successful build, review from team members or cross training. Whatever is needed and it is different for every team. If they launch a product and something goes wrong they go back and update their definition of done so it doesn’t happen again.
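A definition of done is essentially a checklist, so it can be sketched as data plus a small check. The items below are taken from the examples in the paragraph above; the names `DEFINITION_OF_DONE`, `missing_items` and `is_done` are hypothetical, and each team would maintain its own list.

```python
# Example checklist; every team defines and updates its own.
DEFINITION_OF_DONE = [
    "logging",
    "monitoring",
    "documentation",
    "successful build",
    "team review",
]


def missing_items(completed, definition=DEFINITION_OF_DONE):
    # Return the checklist items that have not been ticked off yet,
    # in the order they appear in the definition.
    done = set(completed)
    return [item for item in definition if item not in done]


def is_done(completed, definition=DEFINITION_OF_DONE):
    # Nothing is considered done until all the boxes have been checked.
    return not missing_items(completed, definition)
```

When a launch goes wrong, the team would add a new item to its list (for example a hypothetical "rollback tested"), and from then on `is_done` returns False for any story that skips it.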

In the next part I’ll go into detail about how we use Scrum at LeaseWeb as well as how we identify our goals and develop products from planning the business case to staging a demo.

A maturity model for DevOps teams
http://blog.leaseweb.com/2016/05/30/maturity-model-devops-teams/
Mon, 30 May 2016 10:14:00 +0000

The post A maturity model for DevOps teams appeared first on LeaseWeb Blog.

Last year, we merged our existing operations and development departments into one Product Engineering department. Since then we have been focusing a lot on coaching all 13 teams and improving their effectiveness.

In October last year, we attended an excellent talk by Bol.com at Velocity Amsterdam. In this talk they explained their ongoing transition towards DevOps. One of the concepts they introduced was a maturity model to measure and incentivise continuous improvements within a team.

Inspired by the Bol.com talk, we have since developed and implemented a maturity model within our Product Engineering organization which consists of a matrix of four levels in four categories.

Maturity Levels
Within a category, a team progresses through four levels, starting at “Initial Level”, which essentially means none of the requirements have been checked. From here a team progresses through “Basic Level” and “Intermediate Level” to achieve the “Target Level” (our ideal DevOps team).

The model is based on our belief that, when a team’s maturity increases, so does their autonomy. At the Initial level, a team needs to focus more on following guidelines/standards while at the Target level they will be actively working on improving those same standards – including the ones in this model.

Maturity model categories
Since we’d like teams to focus on different areas, our model consists of the following four categories:

  1. Culture & People: This section focuses on team happiness, self-organization, sharing responsibilities and failing fast.
  2. DevOps Agility: Mostly focused around how the team applies Scrum, and whether or not internal procedures are known and followed.
  3. Business Value: Since we focus on actual business value when working, this section includes items like continuous feedback loops, stakeholder happiness and having an agile roadmap.
  4. Automation & Tooling: Does the team have zero touch continuous deployment? Are tests automated? Is monitoring in place?

Measurement & Visualization
Each team assesses itself based on the maturity model every 2 sprints (4 weeks), usually with the help of our agile coach. On one of the walls in our office, we have a big visualization of the four categories, with magnets representing the teams, so everyone can see what the current status is. Teams move to the next level when all of the checkboxes in the previous level have been checked.
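The progression rule can be sketched as a small function. This is one reading of the model, not the published checklist itself: it assumes that each level above Initial has its own list of checkboxes within a category, and that a team holds the highest level for which that list and all lower ones are fully checked. The names `LEVELS` and `current_level` are made up for this example.

```python
LEVELS = ["Initial", "Basic", "Intermediate", "Target"]


def current_level(checklists):
    """Return a team's level within one category.

    `checklists` maps a level name ("Basic", "Intermediate", "Target")
    to a list of booleans, one per checkbox (assumed structure).
    """
    level = "Initial"
    for name in LEVELS[1:]:
        boxes = checklists.get(name, [])
        if boxes and all(boxes):
            level = name      # every box checked: the team moves up
        else:
            break             # first incomplete level stops the progression
    return level
```

A team with all Basic boxes checked but one Intermediate box still open would therefore sit at "Basic" for that category, even if some Target boxes happen to be checked already.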

Over the past months the model has really helped our teams to grow and become more mature; it’s an excellent guide for them to see what to focus on.

A copy of the maturity model checklist can be found here: https://github.com/LeaseWeb/devops-maturity-model

If you would like to learn more about how LeaseWeb implements Scrum, there will be a talk on this at this year’s TechSummit Amsterdam. Or just talk to me when you’re there!

The Seven Deadly Sins of Web Scale (Part 2)
http://blog.leaseweb.com/2016/05/25/seven-deadly-sins-web-scale-part-2/
Wed, 25 May 2016 13:00:41 +0000

The post The Seven Deadly Sins of Web Scale (Part 2) appeared first on LeaseWeb Blog.

]]>
In this 2-part mini-series, Joshua Hoffman examines some of the common issues companies face when designing for scalability. Read part 1 here.

In my previous blog I looked at what I call the first three sins of web scale – pride (the refusal to use tools not invented here), envy (the desire for a more exciting project) and gluttony (ignoring scope and capacity). Today I’ll discuss the other four sins you need to be aware of when building and deploying your app or product. So without further ado, let’s check them out.

Lust: Premature Optimisation
Continuing with our examples of the seven deadly sins of web scale, this next scenario comes from a company I’ll call Audiogarden that needed to build a timeline service. A timeline service is a backend service that generates the “activity feed” for each user on the site, and it is a critical part of the user experience on any “social” site.
When you first start out building a web application and you’ve never done this kind of thing before, you do what’s called the “naive” design. By this I mean that you write your app in your chosen language and you have a database. Every time someone posts you put it in the database, and when another person logs in you look that information up and display it to the user. It’s simple enough but it doesn’t scale. In order to scale you need to decouple things and break them apart, and if you continue to grow you’ll eventually need to build a service just to perform this task. Pure database queries aren’t going to be able to handle the same load that a caching model can.
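To make the difference concrete, here is a minimal sketch of the naive read path next to a cache-aside read path. It is not Audiogarden’s design: the class, its method names and the in-memory structures standing in for the database and the cache are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class TimelineService:
    """Toy timeline service with a naive and a cached read path."""

    def __init__(self) -> None:
        self.posts: List[Tuple[str, str]] = []          # stands in for a posts table
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        self.cache: Dict[str, List[str]] = {}           # stands in for a cache tier
        self.db_reads = 0                               # counts simulated queries

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.cache.pop(follower, None)                  # follower's feed changed

    def post(self, author: str, text: str) -> None:
        self.posts.append((author, text))
        for follower in self.followers[author]:
            self.cache.pop(follower, None)              # invalidate stale feeds

    def _query_db(self, user: str) -> List[str]:
        """The 'naive' lookup: scan everything on every request."""
        self.db_reads += 1
        following = {a for a, fs in self.followers.items() if user in fs}
        return [text for author, text in reversed(self.posts) if author in following]

    def timeline_naive(self, user: str) -> List[str]:
        return self._query_db(user)

    def timeline_cached(self, user: str) -> List[str]:
        """Cache-aside: only query the database on a cache miss."""
        if user not in self.cache:
            self.cache[user] = self._query_db(user)
        return list(self.cache[user])


svc = TimelineService()
svc.follow("bob", "alice")
svc.post("alice", "hello")
svc.timeline_cached("bob")
svc.timeline_cached("bob")
print(svc.db_reads)  # the second read is served from the cache
```

Invalidating on write is the simplest choice here. A real service would have to decide how stale a feed is allowed to be, which is exactly the kind of requirement that can change underneath you.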
To address this issue a talented engineer was tasked with building a replacement to support the growth and scale that was anticipated. The engineer went off and spent months in isolation researching and writing code. Eventually he produced a service built with two tiers. It was a great piece of software: it handled the timeline service well, and it was scalable and reliable in the right ways. So what went wrong? The engineer had committed what I call the sin of Lust or premature optimization. By the time the new timeline service had been put into production (a challenging migration) the requirements had changed. It was so optimized for the original problem that it just didn’t fit anymore and had to be replaced. The result was months of wasted work on a great piece of software that was no longer usable. This is why you should never try to optimize anything until you know what you need to do. It also stresses the importance of staying in touch with stakeholders throughout the development process and not working on projects in isolation.

Wrath: Insufficient Testing
The next example takes us back to Hipster and involves a job that should be really simple: removing a driver. An engineer was tasked with unloading the IPMI driver from all of the hosts. At the time this happened the company had about 1,000 servers. The IPMI driver is used to communicate with the management board via the host OS rather than over the network interface. It was discovered that if you loaded the driver and sent it some commands there was a possibility, because of a bug, that the driver could deadlock the system. If it was never loaded, it wasn’t an issue. There was no functionality that required communicating with the IPMI interface through the host OS, so the decision was made to push a config to all servers to blacklist the driver so that it would never be loaded again. This was done without incident.
The next step was to find all the servers where the driver was loaded and unload it. The engineer picked a host to test the procedure and it worked flawlessly. He chose a second host and repeated the same steps and it worked flawlessly as well. After deciding that two hosts were sufficient to test he kicked off a job to the entire fleet to unload the driver. What wasn’t known at the time was that the two test servers were the unusual case and the more common case was that unloading the driver triggered the bug that deadlocked the motherboard. To make things even more complicated, all of the servers were a 4-in-1 type chassis where four hosts are sharing two power supplies. The only way to recover a system in this state was to power cycle it. The problem with this is that if only two of your servers are locked you still have to power cycle the entire system and one or both of the healthy servers still in use might be something important like a master database.
Back in the office engineers started seeing servers going down left and right on the dashboards. I made the call to put the site into read-only mode so that we could find out what was happening. The damage was slightly contained in that the host running the job locked up after about the 800th server had gone down but now we had an even bigger problem. We had done a lot of data center automation and only had two or three technicians on site at the time. Everyone available in the office had to pile into cars and drive thirty minutes to the data center where the rest of the day was spent identifying the locked servers, manually pulling them out of their chassis to power cycle them, and shoving them back in. Eight hours later the site was back online.
I call this the sin of Wrath or insufficient testing. The result was approximately 800 physical servers that required a hands-on fix before the site was up and running. It was a significant outage that even made the news and took a lot of time and manpower to fix simply because the engineer hadn’t done adequate testing.
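One common way to limit the blast radius of a fleet-wide change is to roll it out in growing stages and stop as soon as a stage shows failures. The sketch below illustrates that idea only; it is not what Hipster ran. The function name, the stage sizes and the `action` callback (which would wrap “make the change, then health-check the host”) are assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    hosts: Sequence[str],
    action: Callable[[str], bool],
    stages: Sequence[float] = (0.01, 0.10, 1.0),
    max_failure_rate: float = 0.0,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Apply `action` to a growing random share of `hosts`.

    `action` returns True when the host is still healthy afterwards.
    The rollout stops after the first stage whose failure rate exceeds
    `max_failure_rate`. Returns (succeeded, failed) host lists.
    """
    rng = random.Random(seed)
    remaining = list(hosts)
    rng.shuffle(remaining)          # avoid testing only "special" hosts
    total = len(remaining)
    succeeded: List[str] = []
    failed: List[str] = []

    for fraction in stages:
        target = max(1, round(total * fraction))
        batch_size = max(0, target - len(succeeded) - len(failed))
        batch, remaining = remaining[:batch_size], remaining[batch_size:]

        batch_failed = 0
        for host in batch:
            if action(host):
                succeeded.append(host)
            else:
                failed.append(host)
                batch_failed += 1

        if batch and batch_failed / len(batch) > max_failure_rate:
            break                   # stop before touching the rest of the fleet

    return succeeded, failed


fleet = [f"host{i:04d}" for i in range(1000)]
ok, broken = staged_rollout(fleet, lambda host: False)  # every host locks up
print(len(ok), len(broken))
```

With a change that breaks every host, this stops after the first stage of ten machines instead of hundreds. It does not replace proper testing, but picking the sample at random makes it less likely that the only hosts you try are the unusual ones.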

Greed: Making Stuff Tightly Coupled or Monolithic
In the next case study we go back to Pink Shoe Linux. As an early part of their effort to deliver an enterprise platform they created an online service to allow the management of the software on the servers. It was very helpful for people with fleets of machines who wanted a product like this and it worked very well. The engineers built the service with a bespoke content management system and chose to use the existing database then in use at the company which was Oracle. While they were writing the software they decided to hard code all of the Oracle database queries throughout the entire code base. It wasn’t an issue until customers started to request an on-site version for themselves. This was a lucrative opportunity but the potential for a difficult situation arose if the customers didn’t have an Oracle license and didn’t want to get one.
This is what I call the sin of Greed or making stuff tightly coupled or monolithic. If I had to give one piece of advice to anyone creating a new application it would be to never tightly couple your data source to your application. As you scale things up this will hurt you again and again because, depending on the mechanism you’ve chosen, the software may be doing things you can’t easily find and fix. You cannot easily tease apart the interactions between the data source and the application. This is the number one challenge in taking something that worked well at a small scale and bringing it up to a very large web scale.
The result of this sin was that years of work were needed to clean out and abstract away all of the Oracle database queries before there was a clean and separate code base. In the meantime, in order to satisfy customers, the company had to pay for Oracle licenses to ship with the product so that the big customers could use the service without first having to buy the license themselves.
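A minimal sketch of what decoupling can look like: the application talks to a small interface, and the database-specific queries live in exactly one class behind it. Everything here is hypothetical (the `PackageStore` interface, the table layout, the report function). SQLite is used only because it ships with Python; it has nothing to do with what Pink Shoe Linux built.

```python
import sqlite3
from typing import Dict, List, Protocol


class PackageStore(Protocol):
    """The only data-access surface the application is allowed to use."""

    def add(self, host: str, package: str) -> None: ...
    def packages_for(self, host: str) -> List[str]: ...


class SqlitePackageStore:
    """All SQL for this backend lives here and nowhere else."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS packages (host TEXT, name TEXT)")

    def add(self, host: str, package: str) -> None:
        self.conn.execute(
            "INSERT INTO packages (host, name) VALUES (?, ?)", (host, package)
        )

    def packages_for(self, host: str) -> List[str]:
        rows = self.conn.execute(
            "SELECT name FROM packages WHERE host = ? ORDER BY name", (host,)
        )
        return [row[0] for row in rows]


class InMemoryPackageStore:
    """A second backend: swapping it in requires no application changes."""

    def __init__(self) -> None:
        self.data: Dict[str, List[str]] = {}

    def add(self, host: str, package: str) -> None:
        self.data.setdefault(host, []).append(package)

    def packages_for(self, host: str) -> List[str]:
        return sorted(self.data.get(host, []))


def host_report(store: PackageStore, host: str) -> str:
    """Application code: knows nothing about which database is behind `store`."""
    return f"{host}: {', '.join(store.packages_for(host))}"


for store in (SqlitePackageStore(sqlite3.connect(":memory:")), InMemoryPackageStore()):
    store.add("web1", "openssl")
    store.add("web1", "bash")
    print(host_report(store, "web1"))
```

Supporting another database then means writing one more class with the same two methods, rather than hunting for queries scattered through the code base.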

Sloth: Avoiding Maintenance and/or Documentation
Our last example comes from a company I’ll call Americans on the Internet. This was a company that was such an early adopter of technology that nothing like the standardized protocols we use today existed. Everything that was built was proprietary. When they first released their service it ran on one Stratus server, which at the time was the same kind of server used by banks and hospitals because they were very reliable. That level of reliability came with a price though, and these machines were very expensive, costing up to half a million dollars. They could support a lot of users but the problem was that if you went over the capacity even by just a little you had to make another costly purchase in order to run your service.
The decision was made to move to an HP-UX platform; Unix servers were dropping in price and a plan was made to migrate the Stratus data onto the new hosts. As this was not a simple task, developers immediately started to build new software on Unix. In order to make the transfer easy, one of the original Stratus engineers decided to build a gateway service to broker the proprietary Stratus protocol to TCP/IP so that all of the new stuff being built could talk to the gateway service.
After a few years, using HP-UX became too expensive because of the need to buy HP servers and licenses, so the decision was made to move to Linux on commodity hardware. The problem was that no one could find the documentation or the source code for the gateway service. There was no way to be sure exactly what this binary service was doing or how it accomplished its task. The original engineer who had written the gateway program had retired and left the company and no one could find him. This is the sin of Sloth or avoiding maintenance and/or documentation. The fix for this took months of work and there were multiple outages due to many failed attempts before a working replacement was created with good docs and source code.

I hope you find something to take away from these case studies I’ve shared with you to inform the work you do next. If you are starting a new company or a new project, hopefully now you can avoid the Seven Deadly Sins of Web Scale.

The post The Seven Deadly Sins of Web Scale (Part 2) appeared first on LeaseWeb Blog.

]]>
http://blog.leaseweb.com/2016/05/25/seven-deadly-sins-web-scale-part-2/feed/ 0
The seven deadly sins of web scale (Part 1) http://blog.leaseweb.com/2016/05/19/seven-deadly-sins-web-scale-part-1/ http://blog.leaseweb.com/2016/05/19/seven-deadly-sins-web-scale-part-1/#respond Thu, 19 May 2016 09:30:17 +0000 http://blog.leaseweb.com/?p=4882 Throughout my career I’ve had the opportunity to work at a variety of different companies both large and small. They each had their own set of unique challenges regarding growth but one thing I noticed with time and experience was that the solutions to the problems they faced were not specific to the company itself. […]

The post The seven deadly sins of web scale (Part 1) appeared first on LeaseWeb Blog.

]]>
Throughout my career I’ve had the opportunity to work at a variety of different companies both large and small. They each had their own set of unique challenges regarding growth but one thing I noticed with time and experience was that the solutions to the problems they faced were not specific to the company itself. The approaches that were taken and the lessons that were learned could be extrapolated and applied to many of the situations facing a company looking to expand and grow technically.

There is a concept in some religions that before you save a sinner you have to tell them how they have sinned. In other words, if someone doesn’t know what the problem is they won’t be able to change. For a company just starting out, there are no wrong ways to build and deploy your app or product. Once you begin to grow however, you realize there are things you didn’t know and that some or all of the decisions that you made at the beginning were mistakes. This is the point where you need to decide how to address these issues. New companies are started all the time so I decided to draw from my experience to put together what I call the Seven Deadly Sins of Web Scale using seven real world examples from my career.


Pride: Refusal to Use Tools Not Invented Here
The first case study involves a company we’ll call Boohoo that operates many data centers. Like most companies that started early on in the technology scene they had to build their own management platform because there wasn’t a lot of software available that met the needs of running a large internet application or had the ability to manage a large amount of hardware reliably and efficiently. The first data center was built to suit the requirements at the time but as they grew they needed to expand. This was a fork in the road for Boohoo: did they want to enhance and build on their first platform or start over with something completely different?

A new team of engineers was tasked with building out the second data center and instead of deciding to enhance or fix the current technology they wanted to build something completely new. They would build it right this time, it would fix all of the problems, and then they would migrate everything from the old platform to the new and everything would work great.

These engineers committed what I call the sin of Pride or the refusal to use tools not invented here. Instead of using in-house technology or any of the available open-source options they decided they could build it better each time.
The end result is that years later they now have eight data centers that run on eight different data center management platforms and have learned nothing in the process. The job for the engineers who have to support them is much more difficult because not only do they have to code for multiple APIs in order to communicate between the data centers and their different platforms but they must also support multitudes of other software that have been built to abstract on top of them. It also means that troubleshooting any particular team’s software is challenging at best.

Envy: The Desire For a More Exciting Project
Our next case study comes from a company we’ll call Pink Shoe Linux. Pink Shoe Linux wanted to migrate from a bespoke document publishing system to Docbook. Docbook is a specification that can be used for authoring material on the web or published in printed form. You write the content in XML and it gives you some advantages such as having a single source that can generate multiple outputs as well as being able to do things like make an instructor course manual that has quiz answers and a student version that does not. The challenge is in how you generate the output and there are many tools to choose from.

Pink Shoe was already using many Docbook features but with a proprietary build system that needed to be maintained by the people who were authoring the coursework. The decision was made to migrate to an open source standard tool chain for Docbook called Publican. The advantage with Publican was that there would be no need to maintain internally built tools and there were already available resources from within the organization. An engineer was chosen and it was estimated that the task would take a few weeks to get the toolchain working, define the workflow, and write up some documentation.

After several weeks had gone by I happened to take a look at the code and noticed something that rang a few alarm bells. It appeared that the code as written was able to function with or without Docbook. Digging into the situation a little further I learned that the engineer had always wanted to write his own publishing system and had taken this project as an opportunity to do that. In order to meet the requirements he added in the ability to support Docbook. The engineer had committed what I call the sin of Envy or the desire for a more exciting project. A lot of day to day work isn’t glamorous or fun but keeping it simple and effective is, to me, the right answer. This is a challenge because you want to keep engineers happy and give them exciting projects to work on, but if you’re committed to the success of the company and you want to build out everything the company will need to handle the load it’s going to face then you have to have discipline in situations like these.

The result of this was a new bespoke tool that replaced the old one and mostly worked with Docbook. However, it broke a critical feature that was used to export portable translation objects. These objects are sent to a translation company which translates the material into other languages and then sends it back so that it can be imported and you have multiple language versions of your document. Two weeks before the publishing deadline to send the objects to the translator, and almost four months after the project started, I was called in to fix the situation. Most of the existing code had to be thrown out and the standard tool was implemented within the deadline. What should have been a relatively simple project ended up with months of work in the trash.

Gluttony: Ignoring Scope and Capacity
The next example comes from a company I’ll call Hipster. This is a company that started out small and experienced such rapid growth that the peak traffic for the front end from when I first started had become the low point after only a few months. Everything had been built so quickly that there hadn’t been time to implement network metrics. The decision was made to instrument everything and turn it on. At the time we were using an in-house metrics platform built on OpenTSDB running on an HBase cluster. The Network team set everything up and started shipping metrics without telling anyone. Over 2000 network ports started sending every frame to the HBase cluster, which was already being used to monitor critical systems that were running the site. The whole thing fell over because of the sin I call Gluttony or ignoring scope and capacity. The sheer volume of new metrics killed dashboard performance for everyone, which then made troubleshooting actual issues almost impossible.
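Even a back-of-the-envelope estimate before switching on a new metrics source can catch this kind of overload. The sketch below shows the arithmetic; only the figure of 2000 ports comes from the story above. The metrics per port, the reporting interval, the existing load, the cluster capacity and the 30 percent headroom are all made-up numbers for illustration.

```python
def added_write_rate(sources: int, metrics_per_source: int, interval_seconds: float) -> float:
    """Data points per second that a new set of sources will add."""
    return sources * metrics_per_source / interval_seconds


def fits(current_rate: float, added_rate: float, capacity: float, headroom: float = 0.3) -> bool:
    """True if the combined load stays below capacity minus a safety margin."""
    return current_rate + added_rate <= capacity * (1 - headroom)


# 2000 ports (from the story), 20 counters each every 10 seconds (assumed).
extra = added_write_rate(sources=2000, metrics_per_source=20, interval_seconds=10)
print(extra)

# Assumed: the cluster already ingests 50,000 points/s and tops out at 60,000.
print(fits(current_rate=50_000, added_rate=extra, capacity=60_000))
```

If the answer is no, you know before the rollout that you need to sample less often, send fewer counters, or add capacity first.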

If you work in engineering everything you do requires capacity planning. It requires an awareness because there is no unlimited resource. Whether you are using an internal resource or a vendor such as a CDN or cloud provider like LeaseWeb you need to plan at every stage. We’ll cover the next four deadly sins of web scale in part 2.

The post The seven deadly sins of web scale (Part 1) appeared first on LeaseWeb Blog.

]]>
http://blog.leaseweb.com/2016/05/19/seven-deadly-sins-web-scale-part-1/feed/ 0
Remote Management: how it secures and gives you more control over your Bare Metal Servers http://blog.leaseweb.com/2016/05/18/remote-management-secures-gives-control-bare-metal-servers/ http://blog.leaseweb.com/2016/05/18/remote-management-secures-gives-control-bare-metal-servers/#respond Wed, 18 May 2016 10:17:01 +0000 http://blog.leaseweb.com/?p=4869 LeaseWeb is always striving for the best customer experience and we believe that putting you in the driver’s seat is a key factor in this. After all, the more control you have, the faster you can get things done. To help you with this, we automate important self-service processes that enable you to manage your […]

The post Remote Management: how it secures and gives you more control over your Bare Metal Servers appeared first on LeaseWeb Blog.

]]>
LeaseWeb is always striving for the best customer experience and we believe that putting you in the driver’s seat is a key factor in this. After all, the more control you have, the faster you can get things done. To help you with this, we automate important self-service processes that enable you to manage your infrastructure.

Recently we launched a new free feature for our Bare Metal and Dedicated Server products which gives customers secure access to their server’s IPMI interface. The IPMI interface is a very powerful tool that can be used for many things, especially:

  • For debugging issues if your server becomes unreachable
  • Installing an operating system which LeaseWeb does not offer through the Customer Portal
  • Customizing your OS installations

All these actions are made much easier by giving access to the IPMI interface.

LeaseWeb already offered access to the IPMI interface on request by assigning a public IP address so it would be accessible over the internet. However, IPMI interfaces are not known for their security, so exposing them over the internet is far from ideal.

That’s why, during the past year, we invested in our core network infrastructure to be able to offer secure IPMI access through an internal private network which we refer to as ‘The Remote Management Network’.

People with an existing IPMI device on a public IP address are encouraged to switch to the new secure Remote Management Network. Please contact one of our sales representatives to start this process.

How to access the Remote Management Network
You connect to the remote management network by setting up a VPN connection. The technology we use is OpenVPN. It is open source, secure and there are clients available for every operating system. If you have dedicated servers in multiple LeaseWeb data centers you need to establish a VPN connection per data center.
You can download OpenVPN connection profiles to establish these connections when logging in to the LeaseWeb Customer Portal. To see if Remote Management is available for your dedicated server, simply go to the server management page and click the Remote Management tab.

If Remote Management is available, you can view the IP address and credentials for your server’s IPMI interface. Otherwise, simply contact one of our sales representatives to check the available options.

We’re very excited to offer you this new feature. Now we would love to hear from you about how you think we could make it even better – or what other options you’d like to have to manage your servers!

The post Remote Management: how it secures and gives you more control over your Bare Metal Servers appeared first on LeaseWeb Blog.

]]>
http://blog.leaseweb.com/2016/05/18/remote-management-secures-gives-control-bare-metal-servers/feed/ 0
How the cloud is like Minecraft http://blog.leaseweb.com/2016/05/17/cloud-like-minecraft/ http://blog.leaseweb.com/2016/05/17/cloud-like-minecraft/#respond Tue, 17 May 2016 10:00:55 +0000 http://blog.leaseweb.com/?p=4864 Recently I was reading this article in the New York Times about Minecraft. It’s a story about how Minecraft is changing the way children play, learn and create things. It does so by bringing them into a digital environment that provides the freedom to let them fully design their own world, complete with houses, vehicles […]

The post How the cloud is like Minecraft appeared first on LeaseWeb Blog.

]]>
Recently I was reading this article in the New York Times about Minecraft. It’s a story about how Minecraft is changing the way children play, learn and create things. It does so by bringing them into a digital environment that provides the freedom to let them fully design their own world, complete with houses, vehicles and more. Players start mining and expand their environment by chopping trees, mining blocks and creating their own tools. In Minecraft, the article goes, you’re provided with a toolbox to do so, which allows you to be creative and build things. The physical equivalent of Minecraft is somewhat like Lego.

Fortunately, in the Minecraft world, things look simple but can get pretty advanced as well. By using a resource called ‘redstone’, players can build their own machines to make life easier, automating things that are time-consuming or boring. The components used to do this are very similar to the components that are used to design computers – logic gates and digital signals. What’s more, just like Lego, you have to buy the blocks beforehand and people will complain if you need a newer set because the old one doesn’t have the blocks that you need.


Pickaxes and Chef
As someone working in technology during the day but who loves to play Minecraft in my free time, I couldn’t help seeing a parallel with IT infrastructure. On a daily basis, we are providing building blocks and seemingly simple tools that allow customers to build their own solutions as well as to automate things that are time-consuming or boring. What if the cloud is our Minecraft?

In the cloud, the focus is on flexibility and automation. We provide you with an environment as well as a number of (basic and more advanced) tools to create whatever you like. You’re not using pickaxes or furnaces but, rather, Ansible or Chef. Sometimes – like in the Minecraft world – our customers come up with ways to use our infrastructure we didn’t even think about yet. Seemingly simple building blocks can be used to design very complex things. After all, at the end of the day the device you’re reading this on is just made of logic gates and digital signals.

In Minecraft, the player doesn’t deal with physical things. (S)he doesn’t have to manipulate items or click things together – the way we used to play with Lego or other construction toys. Lego solved this issue smartly by designing Minecraft Lego sets. So whenever it makes sense, we can choose to go for the hardware. Because it fits better, or you don’t need the flexibility, or just because it’s easier to ask for a box of Legos. This is not too different from IaaS: whenever you don’t need that flexibility in your cloud infrastructure, you can go for bare metal or more traditional dedicated servers.

It’s not just a game
There are some situations where that parallel goes off track. I can’t combine the things I built in my Minecraft world with the physical Lego set. It’s not yet possible to create a single build that connects those worlds. In IaaS, this is different. Customers can select bare metal, cloud and other infrastructure components, combining them whenever it makes sense to do so. And in infrastructure it often does. The challenge is for the supplier to provide those simple tools that allow this to be done easily. Another huge difference is that your business might be dependent on your cloud whereas what you do in your Minecraft world has little real-world business impact. Carelessly mining blocks in a game can be fun but business-critical infrastructure needs to be well thought-out and carefully designed.

One last important difference is finding your way. In Minecraft, you’re dropped into this new world and part of the game is exploring and figuring out how to survive. In IT infrastructure, there’s more guidance and help available. This is where LeaseWeb comes in. Talk to us, and we’ll help you figure out what combination of Lego or Minecraft – or both – you need, all around the (real) world. We’ll even help you build it, unless you want to connect those blocks together yourself!

The post How the cloud is like Minecraft appeared first on LeaseWeb Blog.

]]>
http://blog.leaseweb.com/2016/05/17/cloud-like-minecraft/feed/ 0