HIGHLY AVAILABLE INFRASTRUCTURE
Author: John McDowell, Lead Enterprise Architect
When you pick up the phone, you want a dial tone every time. Business areas expect the same thing from an IT Infrastructure operation. When companies use hosted services, they expect those services to work consistently. With a modern IT Infrastructure management environment, there are a variety of methods to keep systems operating continuously. These range from Uninterruptible Power Supplies (UPS) to more obscure features, such as the recovery time of modern data routing protocols.
As IT Infrastructure specialists, we need to build a highly available environment in a layered approach. It starts from the ground up with the selection of physical space. One needs to start by choosing a data center location that is not easily impacted by local events; the building shouldn't be sitting by a river that floods every spring. Next, layer on redundant power feeds combined with UPS units that will dual feed into every cabinet. The cabinets themselves need to be arranged in a manner consistent with proper airflow, keeping in mind that detailed heating/cooling plans are needed. If components start failing in August from overheating, then it is time to rethink the cooling plan.
Once the basic facilities are planned out, you move up the stack. The selection of telecommunications partners is critical to the success of any business. If the wrong partner is selected or the design/implementation of the services is done poorly, your business will be cut off at inopportune times. Losing communications and having service downtime will undermine credibility and is a strong driver for clients to look elsewhere. Planning to go with multiple partners or staying with a single source is an important decision. On one side you potentially avoid the vendor-wide outage; on the other side you have to successfully keep multiple services with varied implementations running optimally. A multi-vendor plan assumes diverse paths of entry; otherwise a single telephone pole event could take them both out at once.
From here, the decisions don't get any easier. The data networking team needs to select and implement the right routing protocols. If they don't understand your needs, the applications could flounder while the network is rebuilding itself from a minor event. The storage teams have significant decisions to make around the type of work required. Does the business need to do a lot of real-time transactions, historical analysis, fast intake of mass data, or immediate duplication? Understanding these requirements helps the storage team strike a balance between high-cost chip-speed memory and long-term storage, and decide on the placement of systems. They also need to manage replication of data to ensure your systems recover quickly following a component outage. Compute (CPU) resource services face similar concerns about what the application needs and where it is needed.
Once the essential resources (memory, CPU, network, power) are finalized, you enter the realm of application High Availability. Thanks to the world of virtualization, most operating systems are really application resources that run on abstracted hardware resource pools. Virtual operating systems can reserve resources on any hardware pool they are allowed to reach and reattach themselves to the storage pools. The teams that manage these virtual operating systems need to understand the designs of all the resources they depend on so as not to mismatch availability plans.
These virtual operating systems already offer a default level of High Availability based on the designs of all the resource areas they utilize. A virtual OS may be able to recover itself in real time or near real time as it transitions from a failing hardware resource pool to a healthy pool. At the application level one may also incorporate multiple systems to handle load, High Availability pairs, live-live services, georedundancy, cloud-based recovery options, etc. Just like the rest of the stack, if the application owners make improper assumptions about the resources they depend on, the resource plans will not be aligned and an outage could result. Cross-team planning is paramount to having a fully realized highly available solution.
So how do we best aid a business function as Infrastructure specialists in terms of High Availability? We design our services from the ground up to provide the highest level of availability through building reliable (quality components) and resilient (well designed) systems. We work with our business areas to understand their specific needs and do not try to apply a one-size-fits-all approach to applications. Above all, we never take any component for granted; they all count.
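As a rough illustration of why every layer counts, the availability of a stack can be estimated with simple arithmetic: layers that all must work multiply together, while redundant copies fail only if every copy fails. The sketch below assumes failures are independent (the shared telephone pole example shows that this is not always true), and the layer names and figures are made-up illustrative values, not measurements.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(layers):
    """Availability of a stack in which every layer must be up."""
    return prod(layers)


def redundant_availability(single, copies):
    """Availability of N independent copies where any one copy is enough."""
    return 1 - (1 - single) ** copies


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


# Hypothetical per-layer availabilities, for illustration only.
layers = {"power": 0.9999, "network": 0.999, "storage": 0.9995, "compute": 0.999}

stack = serial_availability(layers.values())
print(f"stack availability: {stack:.4%}")
print(f"expected downtime: {downtime_minutes_per_year(stack):.0f} minutes/year")

# Doubling up the network layer (assuming truly diverse paths).
layers["network"] = redundant_availability(0.999, copies=2)
print(f"with redundant network: {serial_availability(layers.values()):.4%}")
```

The stack is always less available than its weakest layer, which is why a mismatch in one team's plan can undo the investment made everywhere else.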
BUILDING A RELIABLE AND RESILIENT IT ENVIRONMENT THROUGH CULTURE CHANGE
Author: David Howie, Enterprise Architecture Technology
The way of architecting business applications has gone through multiple transformations in the last 30 years, from a centralized computing platform with dumb terminals, to a distributed model with desktop applications. The transformation continued with HTML generated by a collaboration of server side applications and a Service Oriented Architecture (SOA). The current paradigm consists of dynamic HTML in combination with a server side compute model, a vast SOA deployment, and the latest in REST/JSON implementation patterns.
The IT infrastructure has continued to transform as well, with the proliferation of firewalls, load balancers, web servers, web application servers, messaging products, Enterprise Service Buses, specialized appliances, multiple Database Management Systems, and so on.
Business models have transformed as well. Customers and business partners expect 24x7 availability. Business users may span multiple time zones or be on the other side of the planet. Perhaps there is a market for the internal business applications and your organization is now a provider of Software as a Service (SaaS). And aside from all of that, the expectations of the internal business users have changed; unplanned downtime cannot be tolerated.
So what do you do when you find challenges with avoiding unplanned downtime? A common reaction may be thinking that more or better technology is needed. Although there may be opportunities to improve the technology footprint, this thinking may fall short. The IT culture may very well be where the primary focus should lie.
Have your employees adapted to the changing demands of the IT world? When an outage occurs, are they going beyond simply restoring service, driving to true root cause, and deploying solutions to prevent repeat occurrences? Do they recognize that the solution to prevent a repeat is not always a technical solution, but may be a gap in procedures, or a gap in the expectation for team members to follow procedures? Do they strive to understand the collaboration between all the moving parts in the environment, and how a change in one area may affect other areas? Are they figuring out ways to load and stress test the IT services they provide to the organization, or are they assuming such testing will be covered by someone else's efforts? Do they trust that the redundancy works as advertised, or are they finding innovative ways to simulate unexpected events in order to validate that everything behaves as expected when a failover occurs? If they're not doing these things, then they may not be aware of expectations.
How do you change the culture? The first step is communication, and lots of it. You need to communicate the objectives and the business imperatives: "why things must change." You need to communicate expectations relative to the objectives, particularly the behavioral expectations necessary to achieve the desired outcomes. There needs to be support from the top down. All leaders must be on board, continually pushing the future state vision downwards through their organizations. The message needs to be repeated often. One way to engage individuals is to create a short, catchy name that everyone readily associates with the effort and that succinctly drives home the objectives.
Beyond communication, metrics are needed that are well-aligned with the objectives and easy to understand. The metrics need to be put in front of everyone on a regular basis as a continual reminder. If there is a wide divide between current state and desired end state, then the metric goals may need to be adjusted over time, making each adjustment somewhat of a stretch to achieve while remaining reasonably attainable. Don't underestimate the effort needed to routinely gather, format, and publish the metrics. You need to deploy the people, processes, and automation necessary to make the metric reporting sustainable.
When embarking on an effort to improve the reliability and resiliency of the IT environment and business applications, the right people, processes, and technology are all key elements of long-term, sustained success.
STAYING SAFE ONLINE
Author: Jeff Howe, Senior EA Architect
Every day it seems there are new reports of cybersecurity breaches and online account thefts. October is National Cybersecurity Awareness Month, and along with this event come numerous articles on how to remain safe online. Below are a few practices that can help protect online accounts from being compromised by attackers.
Password managers have become an essential tool to help create strong passwords and manage access to online accounts. Key features include browser and mobile app integration to allow automatic capture of login details and the ability to automatically login. Multiple device and platform support can allow management and access to login details from all connected devices.
Another safeguard promoted by cybersecurity experts is enabling multi-factor authentication. This process typically involves designating a secondary email address or device which receives a notification with a code that must be entered before a login or password reset is permitted. Enabling multi-factor authentication where supported can reduce the risk of an attacker gaining access and taking control of online accounts.
On mobile devices, security features should be enabled which require additional authentication before changes can be made to device settings. Do not use the same password or passcode which provides access to the device itself. Enable features which automatically lock devices after a specified time period in order to reduce the likelihood your device is compromised without your knowledge.
Security questions used to authenticate a password reset are another area of concern. Such questions are thought to provide information that only the account holder would know. Answers to questions such as, "What city were you born in" or "What is your oldest sibling's first name," can be discovered through internet searches or by reviewing social media profiles. Using the same thought process as with passwords, provide password-like answers rather than the actual answers. Use a password manager to store the questions and answers.
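For readers comfortable with a little scripting, the idea of password-like answers can be sketched in a few lines. This is a minimal sketch using Python's standard `secrets` module; the answer length and character set are assumptions, and many password managers offer an equivalent built-in generator.

```python
import secrets
import string

# Letters and digits only, since some sites reject symbols in answers
# (an assumption; adjust to what the site accepts).
ALPHABET = string.ascii_letters + string.digits


def random_answer(length=20):
    """Return a random, password-like string to use as a security answer."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


questions = [
    "What city were you born in?",
    "What is your oldest sibling's first name?",
]

# Store each question with its generated answer in a password manager.
answers = {question: random_answer() for question in questions}
```

Because the answers are random, they cannot be found through internet searches or social media profiles, but they also cannot be remembered, so storing them is essential.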
Reviewing passwords and enabling additional security features is time consuming. However, consider the time spent in preparing a strong defense worthwhile when compared to the time spent recovering from an attack which compromises online accounts.
CLOUDS OF MANY COLORS
Author: Thad Henry, Lead EA Architect
The concept of moving applications and/or processes to the cloud is a dream of every business. This move can result in cost reduction, higher mobility and scalability that can be challenging with current on premise ("on-prem") infrastructure solutions. Unfortunately many of these dreams have turned into nightmares. Every cloud platform is shaded in many colors and knowing that they are not all white and fluffy is something every business needs to understand. There are ways to make this process less daunting, but it does require you to do some homework first.
When trying to decide what is needed, evaluating the current topology is a great first step. Every map has a legend, and using colors to separate one important piece from another is helpful for guidance. Taking a critical piece of business and moving it to the cloud can have direct or indirect effects, so gaining better insight into this topology and how the applications interact with one another is essential to a successful implementation. Begin by identifying the systems that feed or are consumed by the piece that is moving. Nothing beats an electronic scan to see how things interact, but a scan captures only part of the overall picture. Architectural resources are a good way to help identify pieces that are known to the environment but may not be picked up on a scan. Even these resources are never 100% accurate and need to be supported by other information. It is critical to know that even with an on premise solution the colors on the map vary greatly.
In addition to identifying integration points, you also need to consider intangibles such as administration of data, user connectivity, business partner impacts, security impacts and other key pieces that make the application work. If you move away from an on premise solution, how will these interact or work in the future? Keep in mind that moving to the cloud does not mean these go away; they are just handled differently, and how they are handled can differ with each cloud vendor.
Now that you know what you want your piece of the cloud to be colored, understand what it is that separates one color from another. This process can be time consuming and requires dedication to the details. The differences between cloud platforms, even though they seem small, can have large impacts to your business. Items such as user administration are vastly different and sometimes require a lot of data from the existing on premise solution. These do take time to work through and coordinate.
Performing evaluations of on premise versus cloud is a key decision point when making platform investment decisions. If this is your first move to the cloud it is highly recommended that you find an experienced partner to help you with the journey. They will make the journey easier and more bearable now and in the future.
As Gilbert Chesterton said, "There are no rules of architecture for a castle in the clouds."
TRANSPARENCY FOR HEALTH PLANS: CONSUMER ENGAGEMENT
Author: Dan Hatfield, Enterprise Architecture
Significant changes in the U.S. healthcare system have led consumers to take a greater responsibility in their healthcare. One of the challenges of engaging consumers to undertake this responsibility is the complexity of the health care ecosystem. In responding to the need for better transparency and engagement, providers have increased their focus on patient education. Payers and providers have spent time and resources ensuring that consumers have several means to access information, such as online content, search tools, options for providers, etc. Much of the focus has been on price and metrics and while these changes are vital, they are not the whole story. Just as important is engaging with consumers in ways that encourage them to manage their own health and well-being.
Health plans have looked to connect directly with consumers through enrollment, care management and healthy living offerings. Consumers interact with a dizzying array of family doctors, urgent care providers, pharmacists and specialists. The health plan ecosystem is also complex because it includes tiered networks, pharmacy, vision, and dental benefits, government and employer programs as well as high deductible and health savings plans. In order to engage effectively with consumers and provide superior health care and service in the future, providers and plans must bring together the information and interactions that are scattered across health plans and providers. The disparate information must be gathered in a timely manner so it can be utilized to drive consumer engagement at the appropriate times. By doing so, the consumer will experience health care services that anticipate their needs and simplify their interactions with the complex health care system.
In order to achieve the next level of integrated consumer engagement, information system investments in event collection and analytics systems are being made. The goal is to bring these disparate interactions together as they occur throughout provider and health plan systems. Once gathered, these interactions are run through new analytic capabilities that can identify consumer engagement opportunities. Using these information system investments to develop relationships with consumers is critical as health plans and providers seek to deliver superior health care and service.
TESTING WITH DE-IDENTIFIED PROTECTED HEALTH INFORMATION (PHI)
Author: Wade Donahue, Lead EA Architect
All too frequently, it seems, there is news of another data breach involving protected health information (PHI). This has not only resulted in a greater focus on data protection by regulatory agencies, but consumers are now demanding greater levels of assurance that their personal information is being protected. As a result, many organizations are making significant people, process, and technology investments to reduce the risk of a data breach. One way to mitigate risk is to test with production data that has been de-identified.
De-identification is a process of taking data fields that contain PHI and either removing them or replacing the contents with other realistic data. For example, phone numbers are replaced with fictitious phone numbers. The HIPAA Safe Harbor method has become a benchmark for the field types to be de-identified. These field types include name, address, phone numbers, Social Security Numbers, etc.
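The remove-or-replace idea can be sketched as follows. This is a minimal illustration, not a substitute for a de-identification product: the field names, the secret key, and the replacement formats are all assumptions. A keyed hash is used so that the same input always maps to the same fictitious value, which keeps records consistent across test tables. The `000` SSN prefix is used because it is never issued; an application that validates SSNs may need a different fictitious range.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def _digits(value, field, count):
    """Derive `count` repeatable digits from a keyed hash of the value."""
    mac = hmac.new(SECRET_KEY, f"{field}:{value}".encode(), hashlib.sha256)
    return str(int(mac.hexdigest(), 16) % 10**count).zfill(count)


def fake_name(value):
    return f"Member-{_digits(value, 'name', 6)}"


def fake_phone(value):
    # 555 prefix used as a conventional marker of a fictitious number.
    d = _digits(value, "phone", 7)
    return f"555-{d[:3]}-{d[3:]}"


def fake_ssn(value):
    d = _digits(value, "ssn", 6)
    return f"000-{d[:2]}-{d[2:]}"


REMOVE = {"address"}  # fields dropped entirely
REPLACE = {"name": fake_name, "phone": fake_phone, "ssn": fake_ssn}


def deidentify(record):
    """Return a copy of the record with PHI fields removed or replaced."""
    clean = {}
    for field, value in record.items():
        if field in REMOVE:
            continue
        replacer = REPLACE.get(field)
        clean[field] = replacer(value) if replacer else value
    return clean
```

Fields that are not listed pass through unchanged, which is exactly why the field inventory has to be verified by people who know the data.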
Moving to testing with de-identified data starts with communicating that goal to all levels of the organization. Make sure to communicate early and often using a variety of media. Now that everyone understands the goal, you need a software solution that provides a variety of methods for de-identifying data. Unless you have very clean data, make sure any solution chosen has custom logic capabilities to enable conditional de-identification.
As you are looking at solutions for de-identification, you'll find many of them provide capabilities to perform scans of your data and identify candidate data fields for de-identification. For example, the scan will identify a continuous string of nine digits as an SSN. This type of scan only gets you started, as the software doesn't know what custom fields you may have created that could identify an individual. This is where data owners and subject matter experts come into play, as they need to verify and augment the scan findings. By the way, did I mention communicating early and often? Let them know this is coming. The results of the verification process provide you with the information you need to configure the de-identification software.
Now that we have the software and have it configured, we are ready to start de-identifying data. It is important to allocate sufficient time to test the data de-identification process and applications because you will have unanticipated data issues to deal with. Unless you have very mature and automated test practices, be sure to allocate time in your plan to address your test case inventory to align the scenarios with the newly created set of de-identified data.
Moving away from testing with production data to testing with de-identified data is not only a technical problem, it requires a major culture shift for everyone involved in the development, testing and support of your applications. By using de-identified data for testing you have taken one step in reducing the risk of a data breach and meeting the expectations of regulatory agencies and healthcare consumers.
WEB APPLICATION SECURITY
Author: Vince Crose, Manager IT
The protection of personal data has always been a concern of consumers in the health industry. The prevalence of recent security breaches has brought further focus to this important topic. Consumers are demanding greater levels of security and assurances that their data is protected. This has brought about a greater focus on the "Principle of Least Privilege" - allow only access to information and resources that are necessary for legitimate purposes. What this really means is that users should only have the privilege and access to resources that are essential for them to complete their work.
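In code, the Principle of Least Privilege usually shows up as a deny-by-default check: access is refused unless it has been explicitly granted. The sketch below is illustrative only; the role names, resources, and actions are hypothetical.

```python
# Hypothetical grants: each role lists only the (resource, action) pairs
# it needs to complete its work.
ROLE_PERMISSIONS = {
    "claims_processor": {("claims", "read"), ("claims", "update")},
    "customer_service": {("claims", "read")},
}


def is_allowed(role, resource, action):
    """Deny by default; allow only explicitly granted pairs."""
    return (resource, action) in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("customer_service", "claims", "read"))
print(is_allowed("customer_service", "claims", "update"))
print(is_allowed("unknown_role", "claims", "read"))
```

An unknown role or an unlisted action falls through to a refusal, so forgetting to configure something fails safe rather than open.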
In the Web application space, this has raised the importance of "walling off" (firewalling) applications at a technology infrastructure level to prevent malicious users of the applications from being able to exploit any existing or future software vulnerabilities. While the use of firewalls for general web network traffic has been around for a long time, there is now an increased focus on using application-specific firewalls. Web Application Firewalls (WAF) take the "walling off" one step further by plugging into the application layer and providing the ability to define rulesets and roles that can access specific applications.
In short, the protection of consumer data is going to require greater levels of controls around how data is accessed. The Web Application Firewall (WAF) technologies provide this ability at the infrastructure level to carry the Principle of Least Privilege beyond user access to the applications and software layers.
DATA PROTECTION: THE CENTER OF EVERYTHING WE DO
Author: Nate Dell, Manager, IT Infrastructure and Security
Defining Data Protection strategies is critical in today's digital and data-driven business. Data protection strategy can be broken down into three distinct groups or functions. The first is tagging or labeling data based on specific data elements. Every organization may place different amounts of value on different data elements. For example, a financial institution may place a high value on bank account numbers or credit cards, whereas a healthcare organization may care about member ID numbers or patient information. What is important is that data elements need to be simple and discoverable in both the unstructured and structured world.
Tagging data is not a trivial task, especially in the unstructured world. For example, let's take a look at everyone's favorite nine-digit number, the Social Security Number (SSN). Simply creating a regular expression that searches all ASCII file types for a nine-digit number will result in a large number of false positives. This rule must be coupled with a check for common key words (SSN, Social, Social Security, Security Number, Membership ID number, etc.) within proximity of that nine-digit number, for example within four Excel columns or in the first third of the page of a Word document. Tagging this information in the structured format is inherently less complicated because the data is structured and typically labeled within a database. By first identifying a key word list for each data element, one can simply use that list to search database column names. Note that data tagging is not a "set it and forget it" type of activity; the technology used to identify this information will need constant minor adjustments as people develop new business document templates or bring new databases online.
Once the data has been tagged and labeled across the organization, a huge feat in itself, it must be classified into buckets. Note that some data elements like SSN can be classified in multiple categories, such as regulated and confidential, and by default should follow the most rigorous control. This classification scheme will drive how technology is deployed. Data is protected in the following three distinct states: at rest, in motion, and in use.
Once the data is tagged and classified, the organization must create processes and technical controls for how the data will be protected. For example, let's follow a document that can be found in most health care organizations: a Word document tagged with the data elements SSN, Member ID, First/Last Name, Address, and medical claims information. Now think about the life cycle of this document and how it must be protected in all three states.
- Data at Rest - Any document that contains these data elements must be housed on an encrypted storage medium. This means that regardless of where this document is stored (Network File Share, Laptops/Desktops, Database, USB Thumb Drives, etc.) all drives must be encrypted.
- Data in Motion - Any time this document is transmitted it must be encrypted, and transmission can only be done by authorized personnel. This means authorization is needed before the document is transmitted to a new location. Once that first hurdle is passed, the transfer must be restricted to approved encrypted communication channels (Encrypted Emails, IM, Web Uploads, Secure File Transfer Protocols, etc.).
- Data in Use - Only authorized personnel can read this document, based on business reason and only on authorized systems. This means that rigorous access controls and monitoring need to be set up so that at any point the question of who/when/why can be answered.
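The keyword-in-proximity rule for unstructured text, and the column-name search for structured data, can be sketched as follows. This is a simplified illustration: the keyword lists are a subset chosen for the example, the dashed SSN format is an added assumption, and a fixed character window stands in for the "four Excel columns" or "first third of the page" proximity tests, which would need format-aware parsing.

```python
import re

# Nine digits, plain or dashed, not embedded in a longer run of digits.
NINE_DIGITS = re.compile(r"(?<!\d)(?:\d{9}|\d{3}-\d{2}-\d{4})(?!\d)")
KEYWORDS = re.compile(
    r"\b(ssn|social security|social|security number)\b", re.IGNORECASE
)
WINDOW = 40  # characters on either side; a hypothetical proximity measure


def find_ssn_candidates(text):
    """Return nine-digit matches that have a keyword nearby."""
    hits = []
    for match in NINE_DIGITS.finditer(text):
        start = max(0, match.start() - WINDOW)
        context = text[start:match.end() + WINDOW]
        if KEYWORDS.search(context):
            hits.append(match.group())
    return hits


# Structured data: search column names against a keyword list.
COLUMN_KEYWORDS = ("ssn", "social", "member_id")


def flag_columns(column_names):
    """Return column names containing any keyword (case-insensitive)."""
    return [
        name for name in column_names
        if any(keyword in name.lower() for keyword in COLUMN_KEYWORDS)
    ]
```

A bare nine-digit order number is ignored, while the same digits next to "SSN" are tagged. Both keyword lists are the part that needs ongoing adjustment as new templates and databases appear.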
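Two of the rules above lend themselves to a short sketch: applying the most rigorous control when a data element falls into several classifications, and recording who/when/why for every attempt to open a document. The classification ordering, user names, and system names below are hypothetical, and a real deployment would rely on the organization's access management and logging tools rather than in-memory structures.

```python
from datetime import datetime, timezone

# Hypothetical ordering of classifications from least to most rigorous.
RIGOR = {"public": 0, "internal": 1, "confidential": 2, "regulated": 3}


def strictest(classifications):
    """Pick the classification whose controls are the most rigorous."""
    return max(classifications, key=RIGOR.__getitem__)


# Hypothetical (user, system) pairs authorized to read the document.
AUTHORIZED = {("alice", "claims-workstation-01")}
audit_log = []


def open_document(user, system, reason, document):
    """Allow access only for an authorized user on an authorized system
    with a stated business reason, and log every attempt."""
    allowed = (user, system) in AUTHORIZED and bool(reason.strip())
    audit_log.append({
        "who": user,
        "when": datetime.now(timezone.utc).isoformat(),
        "why": reason,
        "system": system,
        "document": document,
        "allowed": allowed,
    })
    return allowed
```

Denied attempts are logged as well as granted ones, so the who/when/why question can be answered for both.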
This data protection strategy is a powerful framework and although it sounds simple, is one of the most complex strategies to institutionalize. Data protection impacts everyone and everything.
USING INNOVATIVE TECHNOLOGY TO ENABLE ADVANCED MEDICAL PROVIDER SEARCH
Author: Tim Barnickel, Lead Architect, Enterprise Architecture HM Health Solutions
Emerging innovative technology can now be applied to help consumers easily search for medical providers. A core aspect of this new advanced search engine technology is support for "natural language like" and auto-suggest capabilities, similar to what end users are accustomed to with web search engines such as Google. A key requirement of this new approach, versus existing medical provider search capabilities, is that the consumer is not forced into a rigid and complex structured user interface. As an example of the new approach, a consumer can now input "foot doctor" into a general search box and the system will search for podiatrists, using advanced synonym capabilities, while also leveraging geospatial technology to further guide the search.
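The "foot doctor" example combines two ideas: mapping the consumer's words to a specialty, and ranking results by distance. The sketch below illustrates both with a plain dictionary lookup and the haversine great-circle formula. It is a simplified stand-in, not a description of how any particular search product works, and the synonym table, provider records, and coordinates are invented for the example.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical synonym table and provider directory.
SYNONYMS = {"foot doctor": "podiatrist", "heart doctor": "cardiologist"}
PROVIDERS = [
    {"name": "Provider A", "specialty": "podiatrist", "lat": 40.44, "lon": -79.99},
    {"name": "Provider B", "specialty": "podiatrist", "lat": 40.50, "lon": -80.05},
    {"name": "Provider C", "specialty": "podiatrist", "lat": 39.95, "lon": -75.17},
    {"name": "Provider D", "specialty": "cardiologist", "lat": 40.44, "lon": -79.99},
]
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def search(query, lat, lon, radius_km=25):
    """Return names of matching providers within the radius, nearest first."""
    term = query.strip().lower()
    specialty = SYNONYMS.get(term, term)
    matches = [
        (haversine_km(lat, lon, p["lat"], p["lon"]), p)
        for p in PROVIDERS
        if p["specialty"] == specialty
    ]
    matches.sort(key=lambda pair: pair[0])
    return [p["name"] for distance, p in matches if distance <= radius_km]


print(search("foot doctor", 40.44, -79.99))
```

The consumer's location would typically come from the device, and the radius is a tunable assumption.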
To complement advanced search engine technology, user interface (UI) design and technology has also rapidly evolved to enable consumers to conduct their search across devices with a wide range of capabilities and form factors, from smartphones through desktop computers. To provide a high quality UI, the "mobile first" design technique can be leveraged to optimize the consumer experience on a smartphone, while also leveraging responsive web design to ensure that the UI is rendered appropriately based upon the device's form factor. Furthermore, mobile phone geolocation capabilities assist with proximity based search. Advanced "open source" UI frameworks continue to rapidly evolve with new capabilities to facilitate the implementation of the UI layer.
HM Health Solutions has recently built a new provider search capability that leverages IBM Watson search technology, part of IBM's emerging "cognitive computing" platform. The UI was built with the popular open source framework, AngularJS from Google. Moving forward, HMHS will continue to evaluate advances in natural language search and other innovative technologies to further enhance provider search capabilities for its customers.