Pythian Blog: Technical Track

Cloudscape podcast 2 in review - a deeper dive into the latest in Microsoft Azure

I recently joined Chris Presley on episode 2 of his Cloudscape podcast to share what is new in the world of Microsoft Azure, and we discussed the following:
  • Cosmos DB’s Default Encryption
  • Azure VNet Endpoints for SQL DB
  • Azure Event Grid GA Release
  • Azure M-Series VMs
  • Cosmos DB Graph API GA Release

Cosmos DB's default encryption

We started out by discussing Cosmos DB’s default encryption at rest. As you may know, Cosmos DB, Azure SQL DB and Azure SQL Data Warehouse are all encrypted at rest by default. And it is not just a default: Microsoft has decided that data engine related services have to be encrypted, with no option to opt out. Microsoft likely needs this for liability and compliance reasons, but it is also a positive for clients. Given the importance and increasing coverage of data security regulation like GDPR, anything the provider can do to help customers be compliant is a good idea.

Of course, this is still a shared responsibility. While the service is encrypted at rest, it is still up to the client to mask or encrypt sensitive fields in the database itself.

Azure VNet endpoints for SQL DB

The lack of VNet endpoints has been a thorn in the side of Azure SQL Database, and specifically for those of us who are data professionals trying to migrate people to the service. The reason? For a very long time, Azure did not have a way to restrict access to your Azure SQL database to traffic from inside a particular virtual network. This was a big gap when compared to AWS, for example. Amazon already has full support for VPC, so you can make an Amazon Relational Database Service (RDS) instance accessible only from a specific VPC.

Azure SQL Database did not have this functionality until now. Before, it was all public endpoints protected by firewall IP rules. As you can imagine, this is very cumbersome to manage, because if any IP changes you have to go back, refresh your rules and deal with compliance. For some clients, this was a non-starter because their security standards didn’t allow for public endpoints, even with a firewall.

The VNet endpoint makes it possible to associate one particular VNet with an Azure SQL Database. Any resource inside the VNet can see the Azure SQL database.
You can also filter it down further, so only specific resources inside that VNet can touch a particular Azure SQL Database. If you choose to do so, you can then kill all Internet access to the database.

Azure Event Grid GA release

Event Grid is an event routing service, similar to Amazon's Simple Notification Service (SNS). Many people use Amazon SNS for the same purpose, but the difference with Event Grid is that it’s more tightly coupled to the services themselves, so you don’t need an intermediate service to hook it up.

The main way to go serverless in Azure is through Azure Functions, but rather than requiring a function for everything, Microsoft plans to integrate the consuming services directly with Event Grid, so you can use a service like Logic Apps. For those not familiar with Logic Apps, it does serverless computing through a graphical interface. Rather than invoking a function directly, Event Grid would take the event and forward it to a Logic App. This way, routine operations that require very simple compute can be achieved with no code at all, using Event Grid and a Logic App, for example.

To date, I haven’t seen widespread use of Logic Apps because, in my opinion, many people don’t realize what the service can do or how powerful it really is. Like all of these services, it will take a while to reach wide adoption.

Overall, I think a service like Event Grid will make it even easier to start using other services. This is about making serverless more powerful in Azure and increasing the capacity of what you can build without deploying infrastructure.

Azure M-Series virtual machines

Azure announced the M-Series, which is the biggest virtual machine you can now get in Azure compute. This machine will give you 128 vCPUs and up to four terabytes of RAM. You can hook up eight network cards and get up to 30 gigabits per second of network bandwidth. Its I/O throughput is in the 160,000 IOPS range. It’s a massive, expensive and powerful machine.
The purpose of this machine seems to be either to let people run really big in-memory database workloads or to entice people to host SAP HANA on Azure. There is a lot of competitive pressure here, because providers will continue ramping up the largest VM that you can put in the cloud, which I think is good for consumers in the end.

Cosmos DB Graph API GA release

The thing with graph databases is that they seem to have stayed in the realm of academic exercises for the most part. Even though the modeling and the semantics are powerful, when it’s time to move into production we have performance, high availability, encryption and security to consider. This is what Cosmos DB is trying to streamline for graph.

If you need to design something to represent social relationships, hardware topology or routing, graph is more natural to interact with than relational. Then put it on top of Cosmos DB and you immediately get the HA, geo-replication, elasticity, etc.

People have been building these systems on relational because relational is the default hammer that everybody reaches for, but it doesn’t necessarily lend itself to good modeling for these types of solutions. It also doesn’t lend itself to solving some of the native graph problems, like route traversals and path optimization. For those problems you usually end up with a monster SQL query that nobody really understands.

The key point here is that Cosmos DB gives you a graph data modeling experience while maintaining all the production-grade capabilities built into Cosmos, such as replication, request units and encryption. Hopefully, this will enable a new generation of graph applications. This is something we don’t see very often, so I’m hoping that we’ll see adoption.
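
To make the route traversal point a little more concrete, here is a minimal sketch in plain Python. It does not use Cosmos DB or any graph database API; it only models a small, made-up hardware topology as an adjacency list and finds the path with the fewest hops using a breadth-first search. The device names and the `shortest_path` helper are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical hardware topology: each key is a device and each value lists
# the devices it is directly connected to. All names are invented.
TOPOLOGY = {
    "web-1": ["switch-a"],
    "switch-a": ["web-1", "router-1", "switch-b"],
    "switch-b": ["switch-a", "db-1"],
    "router-1": ["switch-a", "firewall-1"],
    "firewall-1": ["router-1"],
    "db-1": ["switch-b"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return the path with the fewest hops, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])   # each queue entry is a full path from start
    visited = {start}          # devices already reached, to avoid cycles
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route between the two devices


print(shortest_path(TOPOLOGY, "web-1", "db-1"))
# prints ['web-1', 'switch-a', 'switch-b', 'db-1']
```

When the data is modeled as vertices and edges, this kind of question maps directly onto a traversal. Answering it against relational tables typically means recursive self-joins over a links table, which is where the hard-to-read queries mentioned above tend to come from.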
***

This was a summary of the Azure topics we discussed during the podcast. Chris also welcomed Greg Baker (Amazon Web Services) and John Laham (Google Cloud Platform), who discussed topics related to their areas of expertise.

Listen to the full conversation here and be sure to subscribe to the podcast to be notified when a new episode is released.
