AWS Notes: 2nd Annual Startup Launches

I always love hearing startup pitches and launches.

Koality
First was Koality, a build deployment service built, of course, on top of AWS. Koality automatically parallelizes your test suite across available virtual machines, making tests up to 64x faster. They not only do pre-push unit testing (blocking bad changes), they make it easy to have private debug instances. They have a great client list with Dropbox, CrunchBase and most recently Airbnb.

CardFlight
Next was CardFlight, the open platform for mobile payments. They announced two new features: custom manual entry and integration with Braintree. They also offered a free 12-month subscription to their readers.

Runscope
Some former Twilio guys created Runscope, a traffic inspector that aims to make it easier to integrate with APIs (like the AWS APIs, for example). Their debugging tools have tracked over 25 million API calls to date. I liked their motto: Everything is going to be 200 OK. Their new product announcement was Runscope Radar, which lets you add automated API calls to your test suite, with assertions that evaluate the responses.
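To make "assertions that evaluate the responses" a bit more concrete, here is a minimal sketch in Python. It is not Runscope's API. The `check_response` helper and the shape of the response dictionary are my own assumptions, purely for illustration.

```python
import json


def check_response(response, expected_status=200, required_fields=()):
    """Return a list of failed-assertion messages; an empty list means the call passed.

    `response` is a hypothetical dict with "status" (int) and "body" (str) keys.
    """
    failures = []
    if response["status"] != expected_status:
        failures.append(
            f"expected status {expected_status}, got {response['status']}"
        )
    try:
        body = json.loads(response["body"])
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        failures.append("body is not valid JSON")
        return failures
    if not isinstance(body, dict):
        failures.append("body is not a JSON object")
        return failures
    for field in required_fields:
        if field not in body:
            failures.append(f"missing field: {field}")
    return failures


# Two made-up responses: one healthy, one broken.
ok = {"status": 200, "body": '{"id": 7, "name": "demo"}'}
bad = {"status": 500, "body": "oops"}

print(check_response(ok, required_fields=("id", "name")))
print(check_response(bad))
```

In a real test suite the response would come from an actual HTTP call, and a non-empty failure list would fail the build.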

SportXast
SportXast is the easiest way to create, view and share family sporting moments. You can do things like easily get instant replays. You can easily connect to a community around an event and share crowdsourced highlights around the players you care about. When a user uploads a video it goes to S3 to SQS to Elastic Transcoder to CloudFront and back to other users. This was a true launch as this was their very first release to users.

Nitrous.IO
Nitrous is a free cloud-based development environment platform with a web-based IDE and cloud VMs. Their big differentiator from competitors is reduced latency (via CloudFront), to the point that it is indistinguishable from localhost. They offer Google Docs-style collaboration. And of course, they’re hiring. (As is everybody!)

Hopefully one or more of these companies become super successful so I can say I was there when they launched!

AWS Notes: Scaling a Mobile Web App to 100 Million Clients and Beyond

For me, this was the best session of the conference so far. Joey Parsons, Head of Operations at Flipboard, gave a talk about how they grew the company from their first user through to today.

He started by covering Flipboard’s “prototype phase” going to 100 million users. They started with a simple stack of Rails, EC2, S3, RDS, MongoDB and memcache. They submitted to the App Store, launched on the iPad and monitored the CloudWatch analytics. Then, after some initial celebration, they noticed their CPUs were spiking. They spun up new servers and quickly hit the limit on their AWS account. Then they got rate limited on Twitter and Facebook. They made a call that night to rate limit their users to keep things in check, slowly opening the service up for new users.

They soon made a decision to switch to Java instead of Rails and add CloudFront to their stack. They also broke up what was once a monolithic app into separate microservices. They shifted their primary data store to MySQL via Amazon RDS. They started focusing on instrumentation and monitoring. Every instance they had was tracked in a SimpleDB table with detailed information on the instance. This allowed them to do fast and powerful lookups of all the servers that power the operations of their company.

The next milestone for Flipboard was when they launched their iPhone app. Once again, after some initial celebration, they ran into some unforeseen performance problems as it scaled. In one night, with RDS, they were able to build a sharding mechanism that they still use to this day. Funnily enough though, the sharding didn’t matter for their problem — it all came down to one bad query that they fixed and everything went back to working great.
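The talk did not go into how Flipboard's sharding actually works, so the following is only a generic sketch of one common approach: routing each user to a database by hashing the user ID. The shard names and the `shard_for` function are hypothetical.

```python
import hashlib

# Hypothetical shard names; in practice each would map to a separate RDS instance.
SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard for a user with a stable hash of the user ID.

    hashlib is used instead of the built-in hash() so the result is the
    same across processes and restarts.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(shard_for(42))
```

One known trade-off of plain modulo hashing is that adding a shard remaps most users, which is why some systems use a lookup table or consistent hashing instead.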

Their next launch was Android, and there were no bad stories to report there. Their stack continued to grow, adding HBase, Hadoop, Redis, Puppet and more.

They continued to focus heavily on instrumentation for all their services. They set up processing mechanisms with Hadoop, Storm and Kafka. They moved away from deploying with custom bash scripts, switching to Puppet. The most important thing though, he said, was to move away from just throwing hardware at problems and instead focus on using the appropriately sized EC2 instance both for best performance and cost savings.

Their focus on instrumentation was not confined to the server side. Flipboard monitors a number of metrics on the client side by sending reporting data (such as how long it takes to open the app) from their apps to Graphite. They like the tool for metrics from hosts, apps, usage and logging. He gave props to the Cloudwatch2Graphite open source project that brings CloudWatch metrics into Graphite. They divide their deployment into groups and use CloudWatch metrics to catch errors before they deploy. He had a neat slide of a pretty chart that they generate from that data using d3.js and cubism.js, which lets them quickly see which parts of their stack may be causing performance problems.
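For anyone who has not used Graphite: its plaintext protocol is just one line per data point, in the form "metric path, value, Unix timestamp". Here is a minimal sketch of reporting an app-open time that way. The metric name and the host are made up, and I am assuming the default plaintext listener port of 2003. How Flipboard actually ships client metrics was not covered in the talk.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="graphite.example.com", port=2003):
    """Send a formatted line to a Graphite (carbon) plaintext listener.

    The host is a placeholder; 2003 is assumed to be the plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric: the app took 850 ms to open on an iOS client.
line = format_metric("client.ios.app_open_ms", 850, timestamp=1384300000)
print(line, end="")
```

Only `format_metric` runs here; `send_metric` needs a reachable Graphite server.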

What’s next for Flipboard technology? Better use of auto scaling groups, by dialing into lots of signals for better predictive analysis, continued heavy focus on picking the right instance types and taking advantage of any new AWS products.

He concluded with a philosophy that I share which is that the unknown is not scary, but rather it is exciting.

AWS Notes: Amazon Workspaces

This session was about the new Workspaces product that Amazon launched at the day-one AWS re:Invent keynote. The speaker started with the customer problems they hope to solve. The first is delivering desktop virtualization to tablets. The others are enabling workforces to be more flexible and lowering the cost of remote worker infrastructure.

End users can access their VM from a laptop, iPad, Kindle Fire or Android device. In his demo, AWS General Manager Gene Farrell grabbed an iPad that was running Windows, opened up Word and edited a document. It was interesting to see their UI for enabling a Windows 7 PC experience on a touchscreen tablet (via a radial touch menu). It integrates with Active Directory so users can access their organization’s intranet and so forth.

Interestingly, there is no data on the device itself. The service only delivers pixels, and everything is stored on S3.

AWS Notes: Zero to Sixty with AWS Elastic Beanstalk

I ran a little late to this one, which is unfortunate because it was a really good one. I got there as Ann Wallace, Solutions Architect at Nike, was in the middle of her slides.

It seems Nike has a similar setup to AuctionsByCellular actually, with a Java stack built on top of AWS, deploying with Elastic Beanstalk (EBS). They went over their EBS deployment process and how they configure their environment. They do zero-downtime deployments in much the same way we do, by swapping the CNAME of the old environment with the new one. They use .ebextensions to customize their EBS configuration. They showed their template.json file and an example .ebextension. She went over some of the problems that exist with the EBS deploy process (I’m sure Amazon is taking notes).

VTex is a large SaaS e-commerce platform in Brazil serving Latin America. Geraldo Thomaz, co-Founder and co-CEO, talked about how their use of EBS and AWS has evolved. They now have over 60 applications running on EBS. They did a quick demo of how easy it is to do releases with git, which again matched the deploy process we employ at ABC. They have a philosophy of doing many small deployments, multiple times a day. They even created a command line wrapper to further automate their deployment process. They use a Splunk .ebextension to monitor performance and make sure that pushing a new version does not cause a performance hit.

AWS re:Invent: Keynote Day Two

Amazon CTO Werner Vogels is quite a character. He talked about how there are so many products and announcements, it can be confusing. “Rapid delivery is in our DNA.” Werner repeated a theme that I had been hearing over and over again at the conference, that Amazon puts the customer at the center of everything they do. He said that when they start to evaluate new products, the first thing they do is write a press release and an FAQ before they write any code. They achieve rapid delivery by having small, autonomous “two pizza” teams that own their roadmap.

He then announced Amazon RDS for PostgreSQL to much applause.

It was no surprise to see Netflix on stage as they may be AWS’s top and best known customer. Chief Product Officer Neil Hunt talked about all their open source projects. Chief Cloud Architect Adrian Cockcroft then announced the winners of the Netflix Cloud Prize. My favorite was the project that gave Chaos Monkey the additional ability to torture servers.

Werner said we often think of innovation as creating “new stuff”, but oftentimes the best innovations are improvements to things that don’t change and will help a customer forever. He said they focus on performance, security, reliability, cost savings and scale.

The next AWS announcement was I2 instances, SSD servers that have ridiculous read/write speeds, which in turn will enhance DynamoDB’s already fantastic performance.

Next up on stage was Ilya Sukhar, co-founder and CEO of Parse, which offers an SDK that makes it easy to create apps across all devices. Parse is powering 180K applications with push notifications and API requests. He called out Elastic Block Store’s Provisioned IOPS (PIOPS) as being particularly important to their success by delivering consistent DB performance to their apps.

Werner came back to cover security (he announced finer grained access controls and encryption for DynamoDB and other AWS products) and cost savings (new bid-based pricing on the allocation and operations of AWS services).

Last, he spoke about scaling, citing WeTransfer as an AWS powered company that is a platform for transferring large files via email and also popular for serving wallpapers for artists.

Mike Curtis, VP of Engineering at Airbnb, came out to talk about his company and its growth. From day one, it was built on AWS and their policy is that whenever AWS has a product that can solve one of their problems, they use it. They have over 1000 EC2 instances and 50TB of S3 storage for photos. And they do it with a five-person operations team, which is only possible because they are able to leverage AWS.

Werner came back on and showed a neat AWS-powered product called Narrative, which is a lifelogging camera that takes a photo every 30 seconds and sends it to S3 to store.

The next speaker was Dropcam CEO Greg Duffy. Dropcam is a Wi-Fi monitoring camera and cloud service for your home. They are now the largest inbound video service on the internet, with even more video being uploaded to it than YouTube. Without the cost savings provided by AWS, their company could not exist.

Werner came back to talk more about companies using AWS like Moovit, DeConstruction, Netflix and Echo.

Then, he announced yet another new service, Amazon Kinesis, for real-time processing of streaming data at massive scale. This enables things like real-time analytics. It integrates with other AWS products, like DynamoDB, S3, RDS, etc. It was demoed by Khawaja Shams. He showed an example of using it to do data exploration of tweets on Twitter and trends via complex queries on large historical datasets. Undoubtedly this will be a very popular tool in the big data space.

AWS Notes – AWS Storage and Database Architecture Best Practices

AWS Enterprise Solutions Architect Siva Raghupathy started by stating that 2.7 zettabytes (ZB) of data exist in the digital universe today. There will be 450 billion transactions per day by 2020. Most data is unstructured text.

How should we be handling all this data? It is about finding the right tool for the job. He broke down the AWS services into different categories based on the types of problems being solved.

There are primitive compute and storage options, kind of like a hard disk, that add flexibility because you can host any major data storage technology but come with operational burdens.

Next there are managed AWS services, for complex vs. simple queries and structured vs. unstructured data. He included blob stores like S3 and Glacier where you are storing unstructured data that isn’t attached to any query.

He often asks his customers the question, “What is the temperature of your data?” Hot data is smaller, with low latency and a very high request rate. Cold data is vast, mostly static and infrequently requested. Warm data is somewhere in between. He then mapped the various AWS storage services, from hot to cold.

He spoke about cost-conscious design, and then demonstrated the concept with an example. He fired up the AWS Simple Monthly Calculator to figure out the correct AWS data storage service to use based on the cost. In his example, one would first think S3 was the appropriate solution, but after running it through the calculator we saw that because of all the small objects, DynamoDB was a better solution at less than 10% of the cost. You can use the AWS calculator to validate your architecture design. The best design is the one that will cost the least.

You can get further savings by moving data from one store to another as it cools down.

Next he moved on to talking about the AWS database services, starting with RDS. He said to use it for transactions and complex queries, but not for massive numbers of read/writes or simple queries that can be better handled by NoSQL. Furthermore, it is necessary to pick the right RDS DB instance class.

When to use DynamoDB? He said pretty much whenever you can. The only times you wouldn’t use it are for complex queries and transactions or for cold data. For DynamoDB best practices, keep item size small, store large blobs in S3 with metadata in DynamoDB and use a hash key for extremely high scale.

Last, he spoke about ElastiCache for speeding up reads/writes by caching frequent queries. Redis in particular is quite popular, but he noted that it is not a good option when data persistence is important.

He quickly wrapped things up by going over CloudSearch, the AWS text search tool for unstructured data (don’t use it as a replacement for a database), the Redshift data warehouse service for complex queries on large quantities of historical data (copy large data sets from S3 or DynamoDB) and Elastic MapReduce (the “swiss army knife” for parallel scans of huge datasets).

AWS Notes – Scalable Media Processing

I have some background in delivering media via the web from my previous job, for example when I created BigVideo.js.

“For any given TV show, the footage shot will have been converted into over 1000 different formats”

No question, video is a pain in the ass. I was anxious to find out how AWS can make it less so. Clearly you can use it to make a great custom architecture solution for ingesting, processing and serving, with products like Amazon Elastic Transcoder, S3 multipart APIs, SQS and Elemental Cloud. But… still a pain in the ass. I expect eventually they will launch their own service similar to Brightcove. Unfortunately, not just yet.

AWS Notes – Building Cloud-Backed Mobile Apps

This session was about streamlining sign-in with social login, storing user data and more.

AWS Software Engineer Glenn Dierkes spoke about how to use the S3 Transfer Manager for iOS and Android, which lets you pause, resume and cancel uploads to S3 and gives your apps and websites tolerance for upload failures. It does this with multipart uploads that can be done asynchronously in the background.
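The reason multipart uploads make pause and resume possible is that the file is cut into numbered parts, and only the parts that have not been confirmed need to be sent again. Below is a minimal sketch of that bookkeeping in Python. It is not the Transfer Manager's code, and `plan_parts` and `remaining_parts` are hypothetical helpers. The 5 MB figure reflects S3's minimum size for every part except the last.

```python
PART_SIZE = 5 * 1024 * 1024  # S3's minimum size for every part except the last


def plan_parts(total_size, part_size=PART_SIZE):
    """Split an upload into (part_number, offset, length) tuples.

    Part numbers start at 1, as they do in S3 multipart uploads.
    """
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def remaining_parts(parts, completed):
    """Return the parts still to send, given the set of completed part numbers."""
    return [part for part in parts if part[0] not in completed]


# A 12 MiB file becomes three parts. If the app was paused after parts 1
# and 2 were confirmed, only part 3 needs to be uploaded on resume.
parts = plan_parts(12 * 1024 * 1024)
print(remaining_parts(parts, completed={1, 2}))
```

A real client would also persist the upload ID and the completed part numbers so that a resume survives an app restart.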

Next he spoke about the Geo Library for Amazon DynamoDB. It stores geo-hash indexes on the backend and allows you to do location and proximity lookups.
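As a rough illustration of why geo-hash indexes help with proximity lookups, here is a sketch of the classic base-32 geohash. This is the general technique, not necessarily the hashing scheme the AWS library uses internally, and `geohash_encode` is my own helper name. The key property is that nearby points share a common prefix, so a proximity lookup can become a prefix or range query on the index.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a base-32 geohash string.

    Bits alternate between longitude and latitude (longitude first); each
    bit halves the remaining range, and every 5 bits become one character.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    use_lon = True
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            value <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # u4pru
```

A longer hash means a smaller cell, so the precision you index at controls how coarse the proximity search is.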

He then gave a larger overview of the AWS Mobile SDKs and how to use them.

Last, he covered SNS mobile push notifications and how it simplifies messaging across mobile devices.

In case you didn’t know, AWS has open sourced quite a lot of their code on GitHub, so if you are a fan or user, give the AWS GitHub account a good browse.

AWS Notes – Dynamic Content Acceleration: Lightning Fast Web Apps

Now more than ever, having performant applications is essential for users. Every year, Forrester does a study showing how much response time a user will tolerate before abandoning an app. Every year that number shrinks, going from 4 seconds to now somewhere less than one second. A one-second delay results in a 7% revenue loss.

Prasad Kalyanaraman, VP of AWS Edge Services, covered caching static or re-usable content, which is something most apps typically do. Then he talked about how web apps can use Amazon CloudFront to deliver an entire website, including dynamic, static, streaming, and interactive content, using a global network of edge locations.

Next, AWS Solution Architect Parviz Deyhim came out to get more into the technical details of how to improve web app performance. He first talked about the importance of looking at Waterfall Graphs.

One key thing is to look beyond HTML, CSS and JavaScript for cacheable resources. Even API calls may be able to be cached for hours or minutes. Even caching assets for a second can be valuable, especially if you are dealing with a heavy API getting thousands of requests per second. CloudFront can cache any content with a query string, and every unique query string is a separate cached object.
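To see why even a one-second cache matters, here is a small in-process sketch in Python. It only illustrates the idea and says nothing about how CloudFront is implemented or configured. `MicroCache` and the `origin` function are hypothetical, and the cache key is the full URL including its query string, so each unique query string gets its own entry.

```python
import time


class MicroCache:
    """A tiny TTL cache keyed by the full URL, query string included."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch          # function that calls the origin
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the example can fake time
        self._entries = {}           # url -> (stored_at, value)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: the origin is not touched
        value = self._fetch(url)
        self._entries[url] = (now, value)
        return value


# Simulate a burst of 1000 identical requests inside one second.
origin_calls = []
fake_now = [0.0]


def origin(url):
    origin_calls.append(url)
    return f"body for {url}"


cache = MicroCache(origin, ttl_seconds=1.0, clock=lambda: fake_now[0])
for _ in range(1000):
    cache.get("/api/items?page=1")
print(len(origin_calls))  # 1
```

With a one-second TTL, the origin sees roughly one request per second per unique URL, no matter how many clients ask for it.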

Parviz went on to detail a number of ways to use CloudFront to optimize delivery speed of content from your server to your user – definitely worth looking into adding to your tech stack if you haven’t already!

AWS re:Invent: The Keynote

AWS re:Invent is a learning conference. It hosts a diverse spectrum of companies: startups, mid-size companies and large enterprises.

AWS has been in the market over 7 years. Andy Jassy, SVP of AWS, gave a broad overview of the many offerings that Amazon has made available. He said the thing his team is most proud of is the pace at which they’ve been able to roll out new products, with 235 products now on the market, with many more to come in the next 6 weeks. Their client list is pretty ridiculous: Netflix, Microsoft, Adobe, Heroku, LinkedIn, Dropbox, Tumblr, Oracle, and the list goes on.

He talked at length about their security features, and how they make continuous improvements, obtaining federal and health dept certifications, especially for their largest enterprise customers, which trickle down to benefit all their customers, large and small.

He introduced a new offering in the security space, CloudTrail, which logs all API calls to a service and stores them on S3.

Next, he talked about the AWS pricing philosophy: the more customers they have, the more usage there is and the more infrastructure they need, which leads to economies of scale that let them lower prices and get more customers. AWS has had 38 price reductions since 2006.

He then tied back into the theme of the conference: Reinvention. AWS is able to conduct all sorts of experiments with cloud computing that enterprises can’t do because of their cost. Amazon is able to experiment often and fail without risk, enabling customers to rapidly build products on services with deep capabilities.

Andy introduced Jeff Smith, the CEO of The Suncorp Group. Jeff spoke about innovation. He said we constantly underestimate our ability to solve problems. The biggest constraint we face is the constraint of our own thoughts. I loved his Charles Kettering slide.

Andy Jassy came back on stage and emphasized that AWS’s energy is focused on what customers want, which led him to a new product announcement: Amazon Workspaces, cloud desktop virtualization, with access through the browser of tablets.

Next to the stage was a VP from Dow Jones who talked about how they are migrating much of their operation to AWS. He got the most laughs of the keynote when he made a direct appeal to the developers and designers in the audience to join his team. I think it is almost a rule that every speaker has to have a portion of their talk dedicated to recruiting.

Andy Jassy came back to announce a new AWS mobile app development product called AppStream. AppStream uses EC2 to render and compute the user experience, then streams the application in HD video quality to deliver apps to lower end mobile devices.

Two more speakers spoke about how they use AWS, one to create a platform for the SEC to monitor and review all stock market activity to prevent flash crashes and the other (Atomic Fiction) to create amazing imagery with an AWS-powered render farm for big budget Hollywood movies like Star Trek Into Darkness.