26-01-16

AWS Database Choices, Use Cases and Characteristics

I’m currently studying for my AWS Certified Solutions Architect Pro exam, and one of the main topics I always seem to have something of a blind spot about is databases. What I’ve done below is to try and summarise some points about RDS and the other database offerings on AWS as a quick reference guide for the exam. If it helps you, either for the exam or in real life, great!

The notes were put together using AWS documentation and Re:Invent 2015 videos. If you spot any factual errors, please tweet me @ChrisBeckett and I’ll correct them.

Key differences between SQL and NoSQL

  • NoSQL is schema-less, easy reads and writes, simple data model
  • NoSQL scaling is easy
  • NoSQL focusses on performance and availability at any scale
  • SQL has a strong schema, complex relationships, transactions and joins
  • SQL scaling is difficult
  • SQL focusses on data consistency over scale and availability

What is DynamoDB?

  • NoSQL database offering
  • Fully managed by AWS (no need for separate EC2 instances, etc)
  • Single digit millisecond latency
  • Massive and seamless scalability at low cost

Use cases for DynamoDB

  • Internet of Things (Tracking data, real time notifications, high volumes of data)
  • Ad Tech (ad serving, ID lookup, session tracking)
  • Gaming (gaming leader boards, usage history, logs)
  • Mobile and Web (Storing user profiles, session data)
  • All above use cases require high performance, high scale and high volume

DynamoDB Characteristics

  • Automatically replicated (writes) across three Availability Zones in a single region and persisted to SSD. A write is confirmed once the master copy and one replica have been updated
  • Reads can be eventually or strongly consistent, with no latency trade-off; strongly consistent reads are served from the master copy only
  • DynamoDB consists of tables, tables have items, and items have attributes. Because there is no schema, a hash key or item ID must be present to identify each entry in a table (see the sketch after this list)
  • In order to scale properly, you just tell DynamoDB what throughput you need and it will configure the infrastructure for you
  • Pay for the amount of storage used and the amount of throughput (reads and writes)
  • Free tier entitlement of 25GB and 60 million reads and 60 million writes per month. Very cost effective
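
To make the table/item/attribute model and the read consistency options above a bit more concrete, here’s a rough sketch using the Python SDK (boto3). The table, key and attribute names and the throughput figures are made-up example values, not anything from the exam or the AWS docs.

```python
import boto3

dynamodb = boto3.resource("dynamodb", region_name="eu-west-1")

# Create a table: only the key schema is declared up front, everything else is schema-less.
# Throughput is simply declared and DynamoDB provisions the infrastructure behind it.
table = dynamodb.create_table(
    TableName="GameScores",  # hypothetical table name
    KeySchema=[{"AttributeName": "PlayerId", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "PlayerId", "AttributeType": "S"}],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
table.wait_until_exists()

# Write an item - attributes beyond the hash key can vary from item to item.
table.put_item(Item={"PlayerId": "alice", "HighScore": 1200, "LastLevel": "desert"})

# Eventually consistent read (default) vs strongly consistent read (served from the master copy).
eventual = table.get_item(Key={"PlayerId": "alice"})                     # may briefly lag the write
strong = table.get_item(Key={"PlayerId": "alice"}, ConsistentRead=True)  # always up to date
print(eventual.get("Item"), strong["Item"])
```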

What is RDS?

  • Fully managed relational databases (MySQL, PostgreSQL, SQL Server, Oracle, Aurora)
  • Fast, predictable performance
  • Simple and fast to scale
  • Low cost, pay for what you use

Use cases for RDS

  • Anything that requires a SQL back end, such as existing corporate applications
  • RDS supports VPC, high availability, instance scaling, encryption and read replicas for Aurora, MySQL, PostgreSQL
  • MySQL even supports cross region deployment
  • 6TB max storage limit for Oracle, PostgreSQL and MySQL. Aurora is 64TB and SQL Server 4TB
  • Scale storage for all except SQL Server
  • Provisioned IOPS 30,000 for MySQL, PostgreSQL, Oracle. 20,000 for SQL Server
  • Largest RDS instance supported is R3.8XL for all platforms

What is Aurora?

  • 99.99% availability
  • 5x faster than MySQL on the same hardware
  • Distributed storage layer, so scales better
  • Data replicated six times across three availability zones
  • 15 read replicas maximum
  • Fully MySQL compatible

RDS Characteristics

  • Supports three types of storage
    • General purpose SSD for most use cases
    • Provisioned IOPS for guaranteed storage performance of up to 30,000 IOPS
    • Magnetic for inexpensive very small workloads
  • Multi-AZ support – provides automatic failover, synchronous replication and is inexpensive and simple to set up
  • Read replicas can be created in other regions and promoted to a master for easy DR, or used to put data close to the users who consume it (see the sketch after this list)
  • Pay for what you consume; some free tier entitlements exist for RDS (20GB of data storage, 20GB of backups, 10 million I/Os, 750 micro DB instance hours)
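
As a rough illustration of the read replica and promotion workflow above, here’s a minimal boto3 sketch (my addition, not from any AWS doc). The instance identifiers, account number and regions are hypothetical; a cross-region replica is created by calling the API in the destination region and pointing it at the source instance’s ARN.

```python
import boto3

# Create a cross-region read replica: call the API in the *destination* region
# and reference the source instance by ARN (all identifiers here are made up).
rds_dest = boto3.client("rds", region_name="eu-west-1")
rds_dest.create_db_instance_read_replica(
    DBInstanceIdentifier="app-db-replica",
    SourceDBInstanceIdentifier="arn:aws:rds:us-east-1:123456789012:db:app-db",
    DBInstanceClass="db.r3.large",
)

# Later, for DR or to serve a regional user base, promote the replica
# to a standalone master.
rds_dest.promote_read_replica(DBInstanceIdentifier="app-db-replica")
```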

What is ElastiCache?

  • In memory key value store
  • High performance
  • Choice of Redis or Memcached
  • Fully managed, zero admin

ElastiCache Use Cases

  • Caching layer for performance or cost optimisation of an underlying database (see the cache-aside sketch after this list)
  • Storage of ephemeral key-value data
  • High performance application patterns such as session management, event counters, etc
  • Memcached is the simpler of the two, offering cache node auto discovery and multi-AZ node placement
  • Redis is Multi-AZ with auto failover, persistence and read replicas
  • Redis handles more complex data types
  • Monthly bill is number of nodes * duration nodes were used for
  • Some free tier eligibility – 750 micro cache node hours
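
The usual way to use ElastiCache as a caching layer is the cache-aside pattern: check the cache first, fall back to the database on a miss, and write the result back with a TTL. Below is a minimal Python sketch against a Redis endpoint; the endpoint hostname and the load_profile_from_rds() helper are placeholders I’ve invented for illustration.

```python
import json
import redis

# Placeholder endpoint - substitute your ElastiCache Redis cluster address.
cache = redis.Redis(host="my-cache.abc123.euw1.cache.amazonaws.com", port=6379)

def load_profile_from_rds(user_id):
    # Stand-in for a real query against the backing relational database.
    return {"user_id": user_id, "name": "example"}

def get_user_profile(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:               # cache hit - no database round trip
        return json.loads(cached)

    profile = load_profile_from_rds(user_id)
    cache.setex(key, 300, json.dumps(profile))  # cache miss - store with a 5 minute TTL
    return profile
```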

What is Amazon Redshift?

  • Relational data warehouse
  • Massively parallel, petabyte scale
  • Fully managed

Redshift Use Cases

  • Leverage BI tools, Hadoop, machine learning, streaming
  • Pay as you go, grow as you need
  • Analysis in line with data flows
  • Managed availability and data recovery

Redshift Characteristics

  • HDD and SSD platforms
  • Architecture has a leader node (the endpoint all clients connect to; see the connection sketch after this list) and compute nodes. These are linked by 10 Gbps networking
  • Leader node also stores metadata and optimises the query plan
  • Compute nodes have local columnar storage and have distributed, parallel execution of queries, backups, loads, restores, resizes
  • Backed up continuously and incrementally to S3 and across regions
  • Streamed restores
  • Tolerates disk failures, node failures, network failures and AZ/region failures
  • Leader node is free, compute nodes are charged as they are used
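
Because all client communication goes through the leader node and Redshift speaks the PostgreSQL protocol, querying it looks much like querying any PostgreSQL database. The sketch below assumes an existing cluster endpoint and made-up credentials; the DISTKEY/SORTKEY clauses hint at how rows are spread across the compute nodes and ordered in their columnar storage.

```python
import psycopg2

# Connect to the leader node endpoint (placeholder hostname and credentials).
conn = psycopg2.connect(
    host="my-cluster.abc123.eu-west-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="admin", password="example",
)
cur = conn.cursor()

# DISTKEY controls which compute node slice each row lands on;
# SORTKEY drives the on-disk ordering used to skip blocks at query time.
cur.execute("""
    CREATE TABLE IF NOT EXISTS sales (
        sale_id     BIGINT,
        customer_id BIGINT,
        sale_date   DATE,
        amount      DECIMAL(10,2)
    )
    DISTKEY (customer_id)
    SORTKEY (sale_date);
""")
conn.commit()

cur.execute("SELECT customer_id, SUM(amount) FROM sales GROUP BY customer_id LIMIT 10;")
print(cur.fetchall())
```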

08-01-16

AWS Solutions Architect Associate Exam Experience

A little after the fact I know, but on December 21st I went up to Edinburgh (only seat I could get before Christmas) to sit the AWS Solutions Architect Associate exam. I have to say the level of rigour of the security checks was far in excess of anything I’ve ever seen before (roll up your pants legs please!) which was slightly amusing in and of itself.

Once I’d been thoroughly screened in reception, I went through into the exam room to sit the test. There were cameras in the room and also a proctor, which again is pretty unusual in my experience. The exam itself is pretty faithful to the blueprint published by Amazon, which you can find here. It’s a very broad exam, as you might expect – AWS has dozens of services you can use at your leisure, and you have to have a reasonable understanding of most (if not all) of them.

Topic areas I got grilled on included S3, Glacier, EC2, Elastic Beanstalk, SQS, SNS and VPCs. I wouldn’t say you need a massive in depth knowledge of all of these areas, but there’s no such thing as too much preparation. I used Ryan Kroonenberg’s acloud.guru site, the videos are short and concise and represent good value for money. I bought my course originally through Udemy for £9, and they transferred my purchase over to acloud.guru for free.

There are also numerous AWS white papers, the security paper seems to be the favourite doc of most students I’ve seen. After a few pages of pre-amble, it does a pretty reasonable job of outlining all of the AWS services, what they do and what they’re for.

I’d had about six weeks of experience before sitting the exam, but if you are comfortable with virtualisation concepts, you should be OK. That being said, some AWS services have weird and wacky names that are not immediately memorable, so be prepared for that.

The exam itself is multiple choice. I can’t remember offhand, but I think it was around 80 to 85 questions and cost around £100. In the end I knew it would be quite close, but thankfully I got through with a score of 67% (I believe you need 65% to pass). Although there were a lot of questions, I got through it in around 45 minutes, as some questions are very short and you either know the answer or you don’t.

I’m on now to the Solutions Architect Professional exam next month, and the air gets a little rarer up there as the topics are a lot more in depth. I’m putting together some study notes, so assuming I complete them in time and they’re reasonably accurate to the exam, I’ll post them back for the community to use. They won’t be a dump though, so don’t come looking for one.

Happy public clouding!

18-12-15

Amazon Web Services – A Technical Primer for VMware Admins

Yes, yes, I know. Long time no blog. Still, isn’t it meant to be about quality and not quantity? That could spawn a million dirty jokes, so let’s leave it there. So to the matter in hand. Recently I’ve been working on a project that’s required me to have a much closer look at Amazon Web Services (or AWS for the lazy). Probably like most people, I’d heard the name and in my head just thought of it as web servers in the cloud and probably not much more than that. How wrong I was.

However, like most “cloud” concepts, because ultimately it’s based on the idea of virtualisation, it’s actually not that hard to get your head around what’s what and how AWS could be a useful addition to your armoury of solutions for all sorts of use cases. So with that in mind, I thought it would be really useful to put together a short article for folks who are dyed in the wool vSphere admins who might need to add an AWS string to their bow at some time in the near future. Let’s get started.

As you can see from the picture below, logging into the AWS console gives us a bewildering array of services from which to pick, most of which have exotic and funky names such as “Elastic Beanstalk” and “Route 53”. What I’m going to try and do here is to separate out (at a high level) the services AWS offers and how they kind of map into a vSphere world.

[Screenshot: The AWS Console]

Elastic Compute Cloud (EC2)

Arguably the main foundation of AWS, EC2 is the infrastructure as a service element. Herein comes the first of the differences. We no longer refer to the VMs as VMs, but we now refer to them as “instances”. In much the same way we might define it in vRealize or vCD, there are sizes of instances, from nano up to 8 x extra large, which should cater for most use cases. Each instance type has varying sizes of RAM, numbers of vCPUs and also workload optimisations, such as “Compute Optimised” or “Storage Optimised”.

Additionally, instance images are referred to as AMIs, which stands for “Amazon Machine Image”. Similar in concept I suppose to an OVA or OVF. It’s a pre-packaged virtual machine image that can be picked from the service catalog to provision services for end users. As you might expect, AMIs include both Windows and Linux platforms and there is also an AWS Marketplace from where you can trial or purchase pre-packaged AMIs for specific applications or services. In the example screen shot below, you can see that when we go into the “Launch Instance” wizard (think “create a new VM”) we can choose from both Amazon’s service catalog but also the AWS Marketplace. Why re-invent the wheel? If the vendor has pre-packaged it for you, you can trial it and also use it on a pay-as-you-go basis.

[Screenshot: the Launch Instance wizard, showing AMIs from Amazon’s catalog and the AWS Marketplace]

As you can see above, there is a huge amount from which to pick, and it’s very much the same in concept as the VMware Solution Exchange. What’s notable here is the billing concept. Whereas with vSphere we might be thinking in terms of a one off cost for a licence, with AWS, we need to start thinking about perpetual monthly billing cycles, which will also dictate whether or not AWS is suitable and represents value for money.

You can also take an existing AMI, perform some customisation on it (install your application for example) and then save this as an AMI that you can use to create new instances, but these AMIs are only visible to you, not others. I suppose the closest match to this is a template in vCenter. So again, many similarities, just different terminology and slight differences in workflows etc.

It’s also worth adding at this point, before I move properly onto storage, that the main storage platform is called EBS, or Elastic Block Storage. It’s Elastic because it can expand and contract, it’s Block because, well, it’s block level storage (think iSCSI, SAN, etc.) and Storage because, well, it’s storage. At this level, you don’t deal with LUNs and datastores, you just deal with the concept of an unlimited pool of storage, albeit with different definitions. In this sense, it’s similar to the vSphere concept of Storage Profiles.

Storage Profiles can help an administrator place workloads on the appropriate type of storage to ensure consistent and predictable performance. In AWS’s case, you have a choice of three – General Purpose, Provisioned IOPS and Magnetic. More on this in the storage section, but remember that EBS storage is persistent, so when an instance is restarted or powered off, the data remains. You can also add disks to an instance using EBS, for example if you wanted to create a software RAID within your instance.

You may also see references to Instance Storage. This is basically using storage on the host itself, rather than enterprise grade EBS storage. This type of storage is entirely transitory and only lasts for the lifetime of the instance. Once the instance is powered off or destroyed (terminated in AWS parlance), the storage goes with it. Remember that!

One of the good things about EBS is that in the main, SSD storage is used. General Purpose is SSD and is used for exactly that. Provisioned IOPS is used mainly for high I/O workloads such as databases and messaging servers, and Magnetic is spinning disk, so the cheapest of the cheap and used for workloads with modest I/O requirements.
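
Pulling the instance type, AMI and EBS ideas together, here’s a minimal boto3 sketch (my addition, not part of the original walkthrough) that launches a single instance from an AMI onto a General Purpose SSD volume. The AMI ID and key pair name are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# "Create a new VM" in vSphere terms: launch one instance from an AMI.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="t2.micro",          # the instance "size" - vCPU/RAM profile
    KeyName="my-keypair",             # placeholder key pair for SSH access
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvda",
        "Ebs": {"VolumeSize": 20, "VolumeType": "gp2"},  # persistent General Purpose SSD EBS volume
    }],
)
print(response["Instances"][0]["InstanceId"])
```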

Amazon S3

So to another service with an exotic hipster name, Amazon S3. This stands for Simple Storage Service and is Amazon’s main storage service. This differs from EBS as it’s an object based service rather than block based (block being what vSphere admins are more used to).

Amazon refers to S3 locations as “buckets”, and it’s easy to think of them as a bunch of folders. You can have as many buckets as you like and again this storage is persistent. You can upload and download content, set permissions and even publish static websites from an S3 bucket. It’s also worth noting that bucket contents are highly available by way of replication across the region availability zones, but more about that later. By using IAM (Identity and Access Management) you can allow newly provisioned instances to copy content from an S3 bucket say into a web server content directory when they are provisioned, so you are good to go as soon as the instance is.

You can also have versioning, multi-factor authentication and lifecycle policies, but that’s beyond the scope of this article.

It’s not easy to map S3 to a vSphere concept, so we’ll leave it here for now, but at least you know in broad terms what S3 is.
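
As a small sketch of the bucket model using boto3: create a bucket, upload an object and fetch it back. The bucket and file names below are made up, and bucket names have to be globally unique.

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Buckets are the top-level containers; object keys can look like folder paths.
s3.create_bucket(
    Bucket="my-example-bucket",  # placeholder - pick a globally unique name
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)

# Upload a local file as an object, then download it again.
# Assumes a local index.html exists to upload.
s3.upload_file("index.html", "my-example-bucket", "site/index.html")
s3.download_file("my-example-bucket", "site/index.html", "index-copy.html")
```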

AWS Networking

One thing that AWS does very well (or very frustratingly, depending on your viewpoint) is hiding the complexity of networking and simplifying it into a couple of key concepts and wizards.

In vSphere, we have the concepts of vSwitches, VDSes, port groups, VLAN tags, etc. In AWS, you pick a VPC (more on that later), a subnet and whether or not you want it to have an internet facing IP address. That’s pretty much it.

In terms of configuring the networking environment, when you sign up to AWS you get a default VPC. This stands for “Virtual Private Cloud” and is what it says it is – your own little bubble inside of AWS that nobody can see but you (analogous to a vCloud Director Organisational DC). You can add your own VPCs (up to a limit of 5, for now) if you want to silo off different departments or lines of business, for example. Think of a VPC as your vCenter view, but without clusters. VPCs operate pretty much on a simple, flat management model. If you have a PluralSight sub, it’s a good idea to check out Nigel Poulton’s VPC videos for a much better insight on how this all works.

VPCs don’t talk to each other by default, but you can link them together (and link VPCs from other AWS accounts if you want to). Again, it’s difficult to map this to a vSphere concept,  but this helps explain what a VPC is.

Each instance will get an internal RFC 1918 type network address (say 10.x or 192.168.x, depending on how CIDR blocks are configured), and instances requiring external IP addresses will have one added transparently. It’s basically NAT, because the instance itself never sees the external facing address. I know it sounds a bit complicated, but actually it’s not, I’m just not good at explaining it!
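
To show how little setup this involves compared with building out vSwitches and port groups, here’s a hedged boto3 sketch that creates a VPC and a subnet, then launches an instance with an internet-facing address. All of the CIDR blocks and IDs are example values, and a real setup would also need an internet gateway and route table before that public address is reachable.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# One CIDR block for the VPC and a smaller block for a subnet - no vSwitches or VLAN tags to define.
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
subnet = ec2.create_subnet(VpcId=vpc["Vpc"]["VpcId"], CidrBlock="10.0.1.0/24")

# Launch an instance into the subnet and ask for a public IP; AWS handles the 1:1 NAT,
# so the instance itself only ever sees its private RFC 1918 address.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="t2.micro",
    MinCount=1, MaxCount=1,
    NetworkInterfaces=[{
        "DeviceIndex": 0,
        "SubnetId": subnet["Subnet"]["SubnetId"],
        "AssociatePublicIpAddress": True,
    }],
)
```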

Availability Zones

One last concept to cover is Availability Zones (AZ). Generally there are three per region, and right now there are 11 regions worldwide. You can put workloads wherever you like, but if you want to add things like Elastic Load Balancer, you can’t just scatter gun your instances all over the planet.

An AZ in its most basic sense is a physical data centre, so it’s easy to understand from a vSphere perspective. However, in AWS, as there are three AZs per region connected together via high speed, low latency network links, services such as S3 and Elastic Load Balancer (ELB) can take advantage of this. The region is the logical boundary for these services, meaning that S3 data is replicated around all AZs in the region, and load balanced services that sit behind a single ELB can be placed in all three AZs if need be. All of this is configured by default, you don’t need to do anything yourself to let this magic happen.

Managing AWS from vCenter

In all the AWS concepts I’ve mentioned so far, I’ve discussed how things are done from the AWS web console. It’s also possible to manage and migrate VMs to AWS from vCenter Server; this is done with the AWS Management Portal. I haven’t yet tried it, but when I do, I’ll come back and write an article about it. This is a key piece of the puzzle though, as it allows “single pane of glass” management for vSphere and AWS.

In Conclusion

Hopefully this has been a useful primer in mapping AWS concepts to vSphere ones. There are lots of services and constructs that are unique to AWS that don’t necessarily map back, but it’s still important to know what they are. I’ve summarised some of the mappings in the table below (and not all of them are directly 1-1 in concept), hopefully I can add more articles in the coming weeks.

Availability Zone = Data Centre (physical)

VPC = Datacenter (vCenter logical)

EBS = Storage Profiles (similar, but not exactly the same)

Instance = Virtual Machine

AMI = OVA/OVF

13-10-15

VMworld Europe Day Two

Today is pretty much the day the whole conference springs to life. All the remaining delegates join the party with the TAM and Partner delegates. The Solutions Exchange opened for business and there’s just a much bigger bustle about the place than there was yesterday.

The opening general session was hosted by Carl Eschenbach, and credit to him for getting straight in there and talking about the Dell deal. I think most are scratching their heads, wondering what this means in the broader scheme of things, but Carl reassured the delegates that it would still be ‘business as usual’ with VMware acting as an independent entity. That’s not strictly true, as they’re still part of the EMC Federation, who are being acquired by Dell, so not exactly the same.

Even Michael Dell was wheeled out to give a video address to the conference to try and soothe any nerves, giving one of those award ceremony ‘sorry I can’t be there’ speeches. Can’t say it changed my perspective much!

The event itself continues to grow. This year there are 10,000 delegates from 96 countries and a couple of thousand partners.

Into the guts of the content, first up were Telefonica and Novamedia. The former are a pretty well known European telco, and the latter are a multinational lottery company. The gist of the chat was that VMware solutions (vCloud, NSX etc) have allowed both companies to bring new services and solutions to market far quicker than previously. In Novamedia’s case, they built 4 new data centres and had them up and running in a year. I was most impressed by Jan from Novamedia’s comment ‘Be bold, be innovative, be aggressive’. A man after my own heart!

VMware’s reasonably new CTO Ray O’Farrell then came out and with Kit Colbert discussed the ideas behind cloud native applications and support for containers. I’ll be honest at this point and say that I don’t get the container hype, but that’s probably due in no small part to my lack of understanding of the fundamentals and the use cases. I will do more to learn more, but for now, it looks like a bunch of isolated processes on a Linux box to me. What an old cynic!

VMware have taken two approaches to supporting containers. The first is to extend vSphere to use vSphere Integrated Containers, and the second is the Photon platform. The issue with containerised applications is that the vSphere administrator has no visibility into them. It just looks and acts like a VM. With VIC, there are additional plug-ins for the vSphere Web Client that allow the administrator to view which processes are in use, on which host, and how they are performing. All of this management layer is invisible and non-intrusive to the developer.

The concept of ‘jeVM’ was discussed, which is ‘just enough VM’, a smaller footprint for container based environments. Where VIC is a Linux VM on vSphere, the Photon platform is essentially a microvisor on the physical host, serving up resource to containers running Photon OS, which is a custom VMware Linux build. The Photon platform itself contains two objects – a controller and the platform itself. The former will be open sourced in the next few weeks (aka free!), but the platform itself will be subscription only from VMware. I’d like to understand how that breaks down a bit better.

vRealize Automation 7 was also announced, which I had no visibility of, so that was a nice surprise. There was a quick demo with Yangbing Li showing off the drag and drop canvas for advanced service blueprints. I was hoping this release would do away with the need for the Windows IaaS VM(s), but I’m reliably informed this is not the case.

Finally, we were treated to a cross cloud vMotion, which was announced as an industry first. VMs were migrated from a local vSphere instance to a vCloud Air DC in the UK and vice versa. This is made possible by ‘stretching’ the Layer 2 network between the host site and the vCloud Air DC. This link also includes full encryption and bandwidth optimisation. The benefit here is that again, it’s all managed from a familiar place (the vSphere Web Client), and the cross cloud vMotion is just the migration wizard with a couple of extra choices for source and destination.

I left the general session with the overriding feeling that VMware really are light years ahead in the virtualisation market, not just with on premises solutions but hybrid too. They’ve embraced all cloud providers, and the solutions are better for it. Light years ahead of Microsoft in my opinion, and VMware have really raised their game in the last couple of years.

My first breakout session of the day was Distributed Switch Best Practices. This was a pretty good session as I’ve really become an NSX fanboy in the last few months, and VDSes are the bedrock of moving packets between VMs. As such, I noted the following:-

  • DV port group still has a one to one mapping to a VLAN
  • There may be multiple VTEPS on a single host. A DV port group is created for all VTEPs
  • DV port group is now called a logical switch when backed by VXLAN
  • Avoid single point of failure
  • Use separate network devices (i.e switches) wherever possible
  • Up to 32 uplinks possible
  • Recommend 2 x 10 Gbps links, rather than lots of 1 Gbps links
  • Don’t dedicate physical uplinks to management traffic when connectivity is limited, and enable NIOC
  • VXLAN compatible NIC recommended, so hardware offload can be used
  • Configure PortFast and BPDU guard on switch ports, as the VDS does not run STP
  • Always try to pin traffic to a single NIC to reduce risk of out of order traffic
  • VTEP traffic uses only a single uplink in an active/passive configuration
  • Use source based hashing. Good spread of VM traffic and simple configuration
  • Myth that VM traffic visibility is lost with NSX
  • NetFlow and port mirroring are available, and VXLAN ping tests connections between VTEPs
  • Trace flow introduced with NSX 6.2
  • Packets are specially tagged for monitoring, reporting back to NSX controller
  • Trace flow is in vSphere Web client
  • Host level packet capture from the CLI
  • Capture at VDS port group, vmknic or uplink level, and export as pcap for Wireshark analysis
  • Use DFW
  • Use jumbo frames
  • Mark DSCP value on VXLAN encapsulation for Quality of Service

For my final session of the day, I attended The Practical Path to NSX and Network Virtualisation. At first I was a bit dubious about this session, as the first 20 minutes or so just went over old ground of what NSX was and what all the pieces were, but I’m glad I stayed with it, as I got a few pearls of wisdom from it.

  • Customer used NSX for PCI compliance, move VM across data center and keep security. No modification to network design and must work with existing security products
  • Defined security groups for VMs based on role or application
  • Used NSX API for custom monitoring dashboards
  • Use tagging to classify workloads into the right security groups
  • Used distributed objects, vRealize for automation and integration into Palo Alto and Splunk
  • Classic brownfield design
  • Used NSX to secure Windows 2003 by isolating VMs, applying firewall rules and redirecting Windows 2003 traffic to Trend Micro IDS/IPS
  • Extend DC across sites at layer 3 using encapsulation but shown as same logical switch to admin
  • Customer used NSX for metro cluster
  • Trace flow will show which firewall rule dropped the packet
  • VROps shows NSX health and also logical and physical paths for troubleshooting

It was really cool to see how NSX could be used to secure Windows 2003 workloads that could not be upgraded but still needed to be controlled on the network. I must be honest, I hadn’t considered this use case, and better still, it could be done with a few clicks in a few minutes with no downtime!

NSX rocks!

12-10-15

VMworld Europe Day One

Today saw the start of VMworld Europe in Barcelona, with today being primarily for partners and TAM customers (usually some of the bigger end users). However, that doesn’t mean that the place is quiet, far from it! There are plenty of delegates already milling around, I saw a lot of queues around the breakout sessions and also for the hands on labs.

As today was partner day, I had already booked my sessions on the day they were released. I know how quickly these sessions fill up, and I didn’t want the hassle of queuing up outside and hoping that I would get in. The first session was around what’s new in Virtual SAN. There have been a lot of press inches given to the hyper converged storage market in the last year, and I’ve really tried to blank them out. Now the FUD seems to have calmed down, it’s good to be able to take a dispassionate look at all the different offerings out there, as they all have something to give.

My first session was with Simon Todd and was titled VMware Virtual SAN Architecture Deep Dive for Partners. 

It was interesting to note the strong numbers of customers deploying VSAN. There was a mention of 3,000 globally, which isn’t bad for a product that you could argue has only just reached a major stage of maturity. There was the usual gratuitous customer logo slide, one of which was of interest to me. United Utilities deal with water related things in the north west, and they’re a major VSAN customer.

There were other technical notes, such as VSAN being an object based file system, not a distributed one. One customer has 14PB of storage over 64 nodes, and the limitation to further scaling out that cluster is a vSphere related one, rather than a VSAN related one.

One interesting topic of discussion was whether or not to use passthrough mode for the physical disks. What this boils down to is the amount of intelligence VSAN can gather from the disks if they are in passthrough mode. Basically, there can be a lot of ‘dialog’ between the disks and VSAN if there isn’t a controller in the way. I have set it up on IBM kit in our lab at work, and I had to set it to RAID0 as I couldn’t work out how to set it to passthrough. Looks like I’ll have to go back to that one! To be honest, I wasn’t getting the performance I expected, and that looks like it’s down to me.

VSAN under the covers seems a lot more complex than I thought, so I really need to have a good read of the docs before I go ahead and rebuild our labs.

There was also an interesting thread on troubleshooting. There are two fault types in VSAN – degraded and absent. Degraded state is when (for example) an SSD is wearing out, and while it will still work for a period of time, performance will inevitably suffer and the part will ultimately go bang. Absent state is where a temporary event has occurred, with the expectation that this state will be recovered from quickly. Examples of this include a host in maintenance mode or a network connection being down, and this affects how the VSAN cluster behaves.

There is also now the ability to perform some proactive testing, to ensure that the environment is correctly configured and performance levels can be guaranteed. These steps include a ‘mock’ creation of virtual machines and a network multicast test. Other helpful troubleshooting items include the ability to blink the LED on a disk so you don’t swap out the wrong one!

The final note from this session was the availability of the VSAN assessment tool, which is a discovery tool run on customer site, typically for a week, that gathers existing storage metrics and provides sizing recommendations and cost savings using VSAN. This can be requested via a partner, so in this case, Frontline!

The next session I went to was Power Play: What’s New With Virtual SAN and How To Be Successful Selling It. Bit of a mouthful I’ll agree, and as I’m not much of a sales or pre-sales guy, there wasn’t a massive amount of takeaway for me from this session, but Rory Choudhari took us through the current and projected revenues for the hyperconverged market, and they’re mind boggling.

This session delved into the value proposition of Virtual SAN, mainly in terms of costs (both capital and operational) and the fact that it’s simple to set up and get going with. He suggested it could live in harmony with the storage teams and their monolithic frames; I’m not so sure myself. Not from a tech standpoint, but from a political one. It’s going to be difficult in larger, more bureaucratic environments.

One interesting note was Oregon State University saving 60% using Virtual SAN as compared to refreshing their dedicated storage platform. There are now nearly 800 VSAN production customers in EMEA, and this number is growing weekly. Virtual SAN 6.1 also brings with it support for Microsoft and Oracle RAC clustering. There is support for OpenStack, Docker and Photon, and the product comes in two versions.

If you need an all flash VSAN and/or stretched clusters, you’ll need the Advanced version. For every other use case, Standard is just fine.

After all the VSAN content I decided to switch gears and attend an NSX session called Disaster Recovery with NSX, SRM and vRO, presented by Gilles Chekroun. Primarily this session seemed to concentrate on the features in the new NSX 6.2 release, namely the universal objects now available (distributed router, switch, firewall) which span datacentres and vCenters. With cross vCenter vMotion, VMware have really gone all out removing vCenter as the security or functionality boundary to using many of their products, and it’s opened a whole new path of opportunity, in my opinion.

There are currently 700 NSX customers globally, with 65 paying $1m or more for their deployments. This is not just licencing costs, but also integration with third party products such as Palo Alto, for example. Release 6.2 has 20 new features and has the concept of primary and secondary sites. The primary site hosts an NSX Manager appliance and the controller cluster, and secondary sites host only an NSX Manager appliance (so no controller clusters). Each site is aware of things such as distributed firewall rules, so when a VM is moved from one site to another, the security settings are preserved.

Locale IDs have also been added to provide the ability to ‘name’ a site and use the ID to direct routing traffic down specific paths, either locally on that site or via another site. The key takeaway from the session was that DR is typically slow, complex and expensive, with DR tests often only being invoked annually. By providing network flexibility between sites and binding in SRM and vRO for automation, some of these issues go away.

In between times I sat the VCP-CMA exam for the second time. I sat the beta release of the exam and failed it, which was a bit of a surprise as I thought I’d done quite well. Anyway, this time I went through it, some of the questions from the beta were repeated and I answered most in the same way and this time passed easily with a 410/500. This gives me the distinction of now holding a full house of current VCPs – cloud, desktop, network and datacenter virtualisation. Once VMware Education sort out the cluster f**k that is the Advanced track, I hope to do the same at that level.

Finally I went to a quick talk called 10 Reasons Why VMware Virtual SAN Is The Best Hyperconverged Solution. Rather than go chapter and verse on each point I’ll list them below for your viewing pleasure:-

  1. VSAN is built directly into the hypervisor, giving data locality and lower latency
  2. Choice – you can pick your vendor of choice (HP, Dell, etc.) and either pick a validated, pre-built solution or ‘roll your own’ from a list of compatible controllers and hard drives from the VMware HCL
  3. Scale up or scale out, don’t pay for storage you don’t need (typically large SAN installations purchase all forecasted storage up front) and grow as you go by adding disks, SAS expanders and hosts up to 64 hosts
  4. Seamless integration with the existing VMware stack – vROps adapters already exist for management, integration with View is fully supported etc
  5. Get excellent performance using industry standard parts. No need to source specialised hardware to build a solution
  6. Do more with less – achieve excellent performance and capacity without having to buy a lot of hardware, licencing, support etc
  7. If you know vSphere, you know VSAN. Same management console, no new tricks or skills to learn with the default settings
  8. 2000 customers using VSAN in their production environment, 65% of whom use it for business critical applications. VSAN is also now third generation
  9. Fast moving road map – version 5.5 to 6.1 in just 18 months, much faster rate of innovation than most monolithic storage providers
  10. Future proof – engineered to work with technologies such as Docker etc

All in all a pretty productive day – four sessions and a new VCP for the collection, so I can’t complain. Also great to see and chat with friends and ex-colleagues who are also over here, which is yet another great reason to come to VMworld. It’s 10,000 people, but there’s still a strong sense of community.

10-08-15

VCIX-NV Exam Experience

Last Thursday I went over to Leeds to sit the VCIX-NV exam. Obviously regular readers will know I haven’t been using NSX all that long (around 6 weeks, I’d say) and I’ve already managed to get the VCP out of the way, so I figured I needed a new challenge! As per usual, there are no exam questions listed as per the NDA, but if you’re thinking of doing this exam any time soon, I’d recommend it. Advanced exams are always a tough but rewarding experience.

The exam itself, as per the blueprint, is 18 questions with a selection of subtasks. Passing score is 300 out of 500 and obviously you can score points even when you don’t fully meet all question requirements. Total time allowed is 225 minutes, although I didn’t spend a lot of time clock watching until the end.

I’ve read a lot of people complain about latency issues, but I didn’t really see that during my sitting. I have a level of expectation that there will be latency anyway, and it wasn’t so severe that it really made much of a difference to me getting things done. I did have an issue with low colour on the screen, which is obviously a known issue as it was listed on the exam start screen. Again it didn’t prevent me performing any tasks, so I elected against disconnecting and reconnecting as recommended, I’m always paranoid that something bad will go wrong second time around!

The exam itself is very faithful to the blueprint, but as the blueprint is so wide in scope and there are only 18 questions, some areas were not covered at all, which you’d sort of expect. There was certainly nothing in there that I thought was not fair game.

About half way through I had a major issue where a host stopped responding. After informing the proctor and some phone calls to and fro between the test centre, Pearson and VMware, it was decided it was my fault and so therefore wouldn’t be fixed. I wasn’t sure I agreed with that assessment, but as things turned out it worked out in my favour, in a crazy way. Firstly, up to that point I’d been going quite slowly and not managing my time very well (a constant point when sitting VCAP/VCIX exams), so having 20 minutes out of the room to look at the host issue meant that when I went back in, the dead host issue meant a fire was lit under me to get things done quicker and in the end, the dead host had no effect on any other tasks I had to do (and I should add that Pearson did give me the time back on the exam timer).

I did miss one question out that I was saving to the end, but I ran out of time to come back to it. After hitting the finish button with seconds left, I got my score report back on Friday night (thanks again Josh @ VMware for pushing the scoring through) and much to my surprise and utter relief I passed with 300/500. Right on the limit, but a pass is a pass and the exam has helped me identify areas I need to strengthen, so a win-win all around.

In terms of study materials, let me recommend the following:-

The Hands On Lab environment is very similar to the exam environment and working through each exercise several times until you have it down pat is a really effective way of preparing for the exam. Remember during the exam that you can score points in a variety of ways, so make sure to read the question and complete as many tasks as you can, this was basically the key to me just about getting over the line. Even if it’s only one sub task out of three or four, if you can complete it, do it and add it to your total.

Finally, get to your exam centre in plenty of time, stay relaxed and don’t be intimidated! No idea what is next for me exam wise, I think I’ll probably have a breather and wait until the new VCIX-DCV and DTM are released, probably towards Christmas/New Year time.

30-07-15

VCP6-CMA Study Guide : Section 5: Allocate and Manage vRealize Automation Resources

As I predicted in my last blog post, VMware have announced that starting at VMworld 2015 in August, it will be possible to schedule VCP6 exams such as VCP-DCV, VCP-DTM and VCP-CMA. Hopefully this will mean that my beta score for my CMA exam is not too far away now, it would be nice to get a full house of VCPs!

Anyway, also as per my last blog post, I’m publishing section 5 of the study guide, which is as far as I got. Unless I fail the beta and have to resit, I don’t envisage having the time to go back and complete the remaining sections. Hopefully it will be of some use to people planning on having a go at the CMA; any feedback is welcome via Twitter as always.

Objective 5.1: Create and Manage Fabric Groups

Adding and configuring vSphere Endpoints

  • Creating an endpoint creates access to compute resources on a virtualised platform
  • The process involves creating a credential set, defining a cloud endpoint and mapping resources for consumption
  • Log in to the vRealize Automation console as an IaaS administrator.
  • Select Infrastructure > Endpoints > Credentials.
  • Click New Credentials.
  • Enter a name in the Name text box. (Optional) Enter a description in the Description text box.
  • Type the username in the User name text box.
    • Must be in domain\username format, for example mycompany\admin. The credentials must have permission to modify custom attributes
  • Type the password in the Password text boxes.
  • Click the Save icon (green tick)
  • Select Infrastructure > Endpoints > Endpoints.
  • Select New Endpoint > Virtual > vSphere.
  • Enter a name in the Name text box.
    • This must match the endpoint name provided to the vSphere proxy agent during installation or data collection fails.
  • (Optional) Enter a description in the Description text box.
  • Enter the URL for the vCenter Server instance in the Address text box.
  • Select the previously defined Credentials for the endpoint.
    • If your system administrator configured the vSphere proxy agent to use integrated credentials, you can select the Integrated credentials.
  • Only select Specify manager for network and security platform if you plan to integrate with an existing NSX or vCNS instance

Adding and configuring vRealize Automation endpoints

  • I’m assuming here that this refers to Orchestrator!
  • Same process as for vSphere endpoint, except you choose to create a vCO credential using administrator@vsphere.local (assuming using the vCO engine as part of the vRO appliance)
  • Create a new Orchestration endpoint for vCenter Orchestrator
  • Give it a meaningful name, type in the address (typically https://vcoserver:8281/vco)
  • Select the appropriate vCO credential you just created
  • Add a custom property VMware.VCenterOrchestrator.Priority and set it to 1. This is mandatory.

Map compute resources to endpoints

  • A compute resource is an object that represents a host, host cluster, or pool in a virtualization platform, a virtual datacenter, or an Amazon region on which machines can be provisioned.
  • An IaaS administrator can add compute resources to or remove compute resources from a fabric group.
  • A compute resource can belong to more than one fabric group, including groups that different fabric administrators manage.
  • After a compute resource is added to a fabric group, a fabric administrator can create reservations on it for specific business groups. Users in those business groups can then be entitled to provision machines on that compute resource
  • Compute resources such as storage and networking can be assigned from endpoints to Business Groups
  • Reservations are used to carve up resource from compute resources to apply to a Business Group

Assign correct permissions to manage Fabric Groups

  • An IaaS administrator can organize virtualization compute resources and cloud endpoints into fabric groups by type and intent. One or more fabric administrators manage the resources in each fabric group.
  • Fabric administrators are responsible for creating reservations on the compute resources in their groups to allocate fabric to specific business groups. Fabric groups are created in a specific tenant, but their resources can be made available to users who belong to business groups in all tenants.
  • Fabric administrators are created and assigned when creating the Fabric Group
  • A Fabric Administrator can do the following:-
    • Manage build profiles
    • Manage compute resources
    • Manage cost profiles
    • Manage network profiles
    • Manage Amazon EBS volumes and key pairs
    • Manage machine prefixes
    • Manage property dictionary
    • Manage reservations and reservation policies

Perform compute resource data collection

  • vRealize Automation collects data from both infrastructure source endpoints and their compute resources.
  • Data collection occurs at regular intervals. Each type of data collection has a default interval that you can override or modify.
  • IaaS administrators can manually initiate data collection for infrastructure source endpoints and fabric administrators can manually initiate data collection for compute resources.
  • To perform a manual data collection, Log in to the vRealize Automation console as an IaaS administrator.
  • Select Infrastructure > Endpoints > Endpoints
  • Point to the endpoint for which you want to run data collection and click Data Collection.
  • Click Start.
  • (Optional) Click Refresh to receive an updated message about the status of the data collection you initiated.
  • Click Cancel to return to the Endpoints page
  • There are seven different types of data collection:-
    • Infrastructure Source Endpoint Data Collection (Updates information about virtualization hosts, templates, and ISO images for virtualization environments. Updates virtual datacenters and templates for vCloud Director. Updates regions and machines provisioned on them for Amazon. Updates installed memory and CPU count for physical management interfaces.)
    • Inventory Data Collection (Updates the record of the virtual machines whose resource use is tied to a specific compute resource, including detailed information about the networks, storage, and virtual machines. This record also includes information about unmanaged virtual machines, which are machines provisioned outside of vRealize Automation.)
    • State Data Collection (Updates the record of the power state of each machine discovered through inventory data collection. State data collection also records missing machines that vRealize Automation manages but cannot be detected on the virtualization compute resource or cloud endpoint.)
    • Performance Data Collection (vSphere compute resources only) (Updates the record of the average CPU, storage, memory, and network usage for each virtual machine discovered through inventory data collection)
    • vCNS inventory data collection (vSphere compute resources only) (Updates the record of network and security data related to vCloud Networking and Security and NSX, particularly information about security groups and load balancing, for each machine following inventory data collection)
    • WMI data collection (Windows compute resources only) (Updates the record of the management data for each Windows machine. A WMI agent must be installed, typically on the Manager Service host, and enabled to collect data from Windows machines.)
    • Cost data collection (compute resources managed by vRealize Business Standard Edition only) (Updates the CPU, memory, and storage costs for each compute resource managed by vRealize Business Standard Edition. The costs of catalog items that can be provisioned by using the compute resources are updated.)

Perform resource monitoring tasks

The resource monitoring scenarios, the privileges required and where to find them are summarised below:

  • Monitor the amount of physical storage and memory on your compute resources that is currently being consumed and determine what amount remains free. You can also monitor the number of reserved and allocated machines provisioned on each compute resource
    • Privileges required: Fabric Administrator (monitor resource usage on compute resources in your fabric group)
    • Location: Infrastructure > Compute Resources > Compute Resources
  • Monitor physical machines that are reserved for use but not yet provisioned
    • Privileges required: Fabric Administrator
    • Location: Infrastructure > Machines > Reserved Machines
  • Monitor machines that are currently provisioned and under vRealize Automation management
    • Privileges required: Fabric Administrator
    • Location: Infrastructure > Machines > Managed Machines
  • Monitor the amount of storage, memory, and machine quota of your reservation that is currently allocated and determine the capacity that remains available to the reservation
    • Privileges required: Fabric Administrator (monitor resource usage for reservations on your compute resources and physical machines)
    • Location: Infrastructure > Reservations > Reservations
  • Monitor the amount of storage, memory, and the machine quota that your business groups are currently consuming and determine the capacity that remains on reserve for them
    • Privileges required: Tenant Administrator (monitor resource usage for all groups in your tenant) or Business Group Manager (monitor resource usage for groups that you manage)
    • Location: Infrastructure > Groups > Business Groups

Objective 5.2: Create and Manage Reservations

Create and Manage Reservations

  • Before members of a business group can request machines, fabric administrators must allocate resources to them by creating a reservation.
  • Each business group must have at least one reservation for its members to provision machines of that type.
  • Log in to the vRealize Automation console as a fabric administrator
  • A tenant administrator must create at least one business group
  • Select Infrastructure > Reservations > Reservations
  • Select New Reservation > Virtual and select the type of reservation you are creating
  • (Optional) Select an existing reservation from the Copy from existing reservation drop-down menu.
  • Data from the reservation you chose appears, and you can make changes as required for your new reservation
  • Select a compute resource on which to provision machines from the Compute resource drop-down menu.
  • Only templates located on the cluster you select are available for cloning with this reservation.
  • The reservation name appears in the Name text box.
  • Enter a name in the Name text box
  • Select a tenant from the Tenant drop-down menu.
  • Select a business group from the Business group drop-down menu.
    • Only users in this business group can provision machines by using this reservation
  • (Optional) Select a reservation policy from the Reservation policy drop-down menu.
    • This option requires additional configuration. You must create a reservation policy
  • (Optional) Type a number in the Machine quota text box to set the maximum number of machines that can be provisioned on this reservation.
    • Only machines that are powered on are counted towards the quota. Leave blank to make the reservation unlimited.
  • Type a number in the Priority text box to set the priority for the reservation.
    • The priority is used when a business group has more than one reservation. A reservation with priority 1 is used for provisioning over a reservation with priority 2.
  • (Optional) Deselect the Enable this reservation check box if you do not want this reservation active.
  • (Optional) Add any custom properties

Specify Reservation Information

  • A reservation is a share of provisioning resources allocated by the fabric administrator from a fabric group and reserved for use by a particular business group
  • A virtual reservation is a share of the memory, CPU, networking, and storage resources of one compute resource allocated to a particular business group.
  • Each reservation is for one business group. A business group can have multiple reservations on a single compute resource. A business group can also have multiple reservations on compute resources of different types.
  • A physical reservation is a set of physical machines reserved for and available to a particular business group for provisioning.

Create and Manage a Cloud Reservation

  • A cloud reservation provides access to the provisioning services of a cloud service account for a particular business group.
  • A group can have multiple reservations on one endpoint or reservations on multiple endpoints.
  • A reservation may also define policies, priorities, and quotas that determine machine placement.
  • The reservation must be of the same platform type as the blueprint from which the machine was requested
  • The reservation must be enabled
  • The reservation must have capacity remaining in its machine quota or have an unlimited quota.
    • The allocated machine quota includes only machines that are powered on. For example, if a reservation has a quota of 50, and 40 machines have been provisioned but only 20 of them are powered on, the reservation’s quota is 40 percent allocated, not 80 percent
  • The reservation must have the security groups specified in the machine request.
  • The reservation must be associated with a region that has the machine image specified in the blueprint.
  • For Amazon machines, the request specifies an availability zone and whether the machine is to be provisioned in a subnet in a Virtual Private Cloud (VPC) or in a non-VPC location. The reservation must match the network type (VPC or non-VPC).
  • If the cloud provider supports network selection and the blueprint has specific network settings, the reservation must have the same networks.
    • If the blueprint or reservation specifies a network profile for static IP address assignment, an IP address must be available to assign to the new machine.
  • If the blueprint specifies a reservation policy, the reservation must belong to that reservation policy.
    • Reservation policies are a way to guarantee that the selected reservation satisfies any additional requirements for provisioning machines from a specific blueprint. For example, if a blueprint uses a specific machine image, you can use reservation policies to limit provisioning to reservations associated with the regions that have the required image.
  • If no reservation is available that meets all of the selection criteria, provisioning fails.