Bring your Data Securely to the Cloud by Implementing Column Level security, Row Level Security & Dynamic Data Masking with Azure Synapse Analytics

Azure Synapse Analytics from Microsoft is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. SNP helps its customers migrate their legacy data warehouse solutions to Azure Synapse Analytics to gain the benefits of an end-to-end analytics platform that provides high availability, security, speed, scalability, cost savings, and industry-leading performance for enterprise data warehousing workloads.

A common business scenarios we cover:

As organizations scale, data grows exponentially. And with the workforce working remotely, data protection is one of the primary concerns of organizations around the world today. There are several high-level security best practices that every enterprise should adopt, to protect their data from unauthorized access. Here are our recommendations to help you prevent unauthorized data access.

The SNP solution:

With Azure Synapse Analytics, SNP provides its customers enhanced security with column level security, row-level security & dynamic data masking.

Azure Synapse SecurityBelow is an example of a sample table data which is required to implement the column level security, row-level security & dynamic data masking for your data.

Revenue table:

Azure Synapse Security

Codes:

Step:1 Create users

create user [CEO] without login;

create user [US Analyst] without login;

create user [WS Analyst] without login;

 

Column Level Security

A column-level security feature in Azure Synapse simplifies the design and coding of security in applications. It ensures column-level security by restricting column access to protect sensitive data.

In this scenario, we will be working with two users. The first one is the CEO, who needs access to all company data. The second one is an Analyst based in the United States, who does not have access to the confidential Revenue column in the Revenue table.

Follow this lab, one step at a time to see how Column-level security removes access to the revenue column to US Analyst.

 

Step:2 Verify the existence of the “CEO” and “US Analyst” users in the Data Warehouse.

SELECT Name as [User1] FROM sys.sysusers WHERE name = N’CEO’;

SELECT Name as [User2] FROM sys.sysusers WHERE name = N’US Analyst’;

 

Step:3 Now let us enforce column-level security for the US Analyst.

The revenue table in the warehouse has information like Analyst, CampaignName, Region, State, City, RevenueTarget, and Revenue. The Revenue generated from every campaign is classified and should be hidden from US Analysts.

REVOKE SELECT ON dbo.Revenue FROM [US Analyst];

GRANT SELECT ON dbo.Revenue([Analyst], [CampaignName], [Region], [State], [City], [RevenueTarget]) TO [US Analyst];

Azure Synapse SecurityThe security feature has been enforced,  where the following query with the current user as ‘US Analyst’, this will result in an error. Since the US Analyst does not have access to the Revenue column the following query will succeed since we are not including the Revenue column in the query.

Azure Synapse SecurityAzure Synapse Security

Row Level Security

Row-level Security (RLS) in Azure Synapse enables us to use group membership to control access to rows in a table. Azure Synapse applies the access restriction every time data access is attempted from any user.

In this scenario, the revenue table has two Analysts, US Analysts & WS Analysts. Each analyst has jurisdiction across a specific Region. US Analyst on the South East Region. An Analyst only sees the data for their own data from their own region. In the Revenue table, there is an Analyst column that we can use to filter data to a specific Analyst value.

SELECT DISTINCT Analyst, Region FROM dbo.Revenue order by Analyst ;

Review any existing security predicates in the database

SELECT * FROM sys.security_predicates

 

Step:1

Create a new Schema to hold the security predicate, then define the predicate function. It returns 1 (or True) when a row should be returned in the parent query.

CREATE SCHEMA Security

GO

CREATE FUNCTION Security.fn_securitypredicate(@Analyst AS sysname)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS fn_securitypredicate_result

WHERE @Analyst = USER_NAME() OR USER_NAME() = ‘CEO’

GO

Step:2

Now we define a security policy that adds the filter predicate to the Sale table. This will filter rows based on their login name.

CREATE SECURITY POLICY SalesFilter 

ADD FILTER PREDICATE Security.fn_securitypredicate(Analyst)

ON dbo.Revenue

WITH (STATE = ON);

Allow SELECT permissions to the Sale Table.

GRANT SELECT ON dbo.Revenue TO CEO, [US Analyst], [WS Analyst];

 

Step:3

Let us now test the filtering predicate, by selecting data from the Sale table as ‘US Analyst’ user.

Azure Synapse SecurityAs we can see, the query has returned rows here. Login name is US Analyst and Row-level Security is working.

Azure Synapse Security

Azure Synapse Security

Dynamic Data Masking

Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal impact on the application layer. DDM can be configured on designated database fields to hide sensitive data in the result sets of queries. With DDM the data in the database is not changed. Dynamic data masking is easy to use with existing applications since masking rules are applied in the query results. Many applications can mask sensitive data without modifying existing queries.

In this scenario, we have identified some sensitive information in the customer table. The customer would like us to obfuscate the Credit Card and Email columns of the Customer table to Data Analysts.

Let us take the below customer table:

Azure Synapse SecurityConfirmed no masking enabled as of now,

Azure Synapse Security

Let us make masking for Credit card & email information,

Step:1

Now let us mask the ‘CreditCard’ and ‘Email’ Column of the ‘Customer’ table.

ALTER TABLE dbo.Customer 

ALTER COLUMN [CreditCard] ADD MASKED WITH (FUNCTION = ‘partial(0,”XXXX-XXXX-XXXX-“,4)’);

GO

ALTER TABLE dbo.Customer

ALTER COLUMN Email ADD MASKED WITH (FUNCTION = ’email()’);

GO

 

Now, the results show masking enabled for data:

Azure Synapse SecurityExecute query as User ‘US Analyst’, now the data of both columns is masked,

Azure Synapse SecurityUnmask data:

Azure Synapse Security

Conclusion:

From the above samples, SNP has shown how column level security, row level security & dynamic data masking can be implemented in different business scenarios. Contact SNP Technologies for more information.

Top 5 FAQs on Operationalizing ML Workflow using Azure Machine Learning

Enterprises today are adopting machine learning and artificial intelligence (AI) at a rapid pace to stay ahead of their competition, deliver innovation, improve customer experiences, and grow revenue. However, the challenges with such integrations is that the development, deployment and monitoring of these models differ from the traditional software development lifecycle that many enterprises are already accustomed to.

Leveraging AI and machine learning applications, SNP helps bridge the gap between the existing state and the ideal state of how things should function in a machine learning lifecycle to achieve scalability, operational efficiency, and governance.

SNP has put together a list of the top 5 challenges enterprises face in the machine learning lifecycle and how SNP leverages Azure Machine Learning to help your business overcome them.

Q1. How much investment is needed on hardware for data scientists to run complex deep learning algorithms?

By leveraging Azure Machine Learning workspace, data scientists can use the same hardware virtually at a fraction of the price. The best part about these virtual compute resources is that businesses are billed based on the amount of resources consumed during active hours thereby reducing the chances of unnecessary billing.

Q2: How can data scientists manage redundancy when it comes to training segments and rewriting existing or new training scripts that involves collaboration of multiple data scientists?  

With Azure data pipelines, data scientists can create their model training pipeline consisting of multiple loosely coupled segments which are reusable in other training pipelines. Data pipelines also allows multiple data scientists to collaborate on different segments of the training pipeline simultaneously, and later combine their segments to form a consolidated pipeline.

Q3. A successful machine learning life cycle involves a data scientist finding the best performing model by using multiple iterative processes. Each process involves manual versioning which results to inaccuracies during deployments and auditing. So how best can data scientists manage version controlling?

Azure Machine Learning workspace for model development can prove to be a very useful tool in such cases. It tracks performance metrics and functional metrics of each run to provide the user with a visual interface on model performance during training. It can also be leveraged to register models developed on Azure Machine Learning workspace or models developed on your local machines for versioning. Versioning done using Azure Machine Learning workspace makes the deployment process simpler and faster.

Q4. One of the biggest challenges while integrating the machine learning model with an existing application is the tedious deployment process which involves extensive manual effort. So how can data scientists simplify the packaging and model deployment process?

Using Azure Machine Learning, data scientists and app developers can easily deploy Machine Learning models almost anywhere. Machine Learning models can be deployed as a standalone endpoint or embedded into an existing app or service or to Azure IoT Edge devices.

Q5. How can data scientists automate the machine learning process?

A data scientist’s job is not complete once the Machine Learning model is integrated into the app or service and deployed successfully. It has to be closely monitored in a production environment to check its performance and must be re-trained and re-deployed once there is sufficient quantity of new training data or when there are data discrepancies (when actual data is very different from the data on which your model is trained on and is affecting your model performance).

Azure Machine Learning can be used to trigger a re-deployment when your Git repository has a new code check-in. Azure Machine Learning can also be used to create a re-training pipeline to take new training data as input to make an updated model. Additionally, Azure Machine Learning provides alerts and log analytics to monitor and govern the containers used for deployment with a drag-drop graphical user interface to simplify the model development phase.

Start building today!

SNP is excited to bring you machine learning and AI capabilities to help you accelerate your machine learning lifecycle, from new productivity experiences that make machine learning accessible to all skill levels, to robust MLOps and enterprise-grade security, built on an open and trusted platform helping you drive business transformation with AI. Contact SNP here.

Azure’s Software Defined Connectivity — Virtual WAN

The hybrid cloud network consists of both physical and virtualized technologies to provide connectivity across Cloud, private data centers, on-premises, and branch offices. To help customers with their massive modernization efforts, SNP leverages the Azure Virtual WAN to build and deploy applications while simplifying branch connectivity. 

Azure Virtual WAN:

Azure’s Virtual WAN is software-defined connectivity that allows you to take advantage of optimized and automated branch connectivity on a global scale with Azure. Virtual WAN provides a better networking experience by seamlessly connecting branches to Azure with SDWAN & VPN devices (i.e., Customer Premises Equipment or CPE) with built-in ease of use and configuration management. It also provides security and routing functionalities to provide a single operational interface.

  • Branch connectivity (via connectivity automation from Virtual WAN Partner devices such as SD-WAN or VPN CPE).
  • Site-to-site VPN connectivity.
  • Remote user VPN connectivity (point-to-site).
  • Private connectivity (ExpressRoute).
  • Intra-cloud connectivity (transitive connectivity for virtual networks).
  • VPN ExpressRoute inter-connectivity.
  • Routing, Azure Firewall, and encryption for private connectivity.

 

How it works:

Traffic from branches goes into Microsoft’s network at the Microsoft edge site which is closest to a given branch office. Currently, there are 130 of these sites in the Microsoft global network. Once traffic is within the network, it can terminate one of your Virtual WAN’s virtual hubs. These hubs cover different Azure public Cloud regions around the world.

 

Azure’s Virtual WAN offers benefits like:

  • Integrated connectivity solutions in hub and spoke: Automate site-to-site configuration and connectivity between on-premises branch office and an Azure hub.
  • Automated spoke setup and configuration: Connect virtual networks and workloads to the Azure hub seamlessly.
  • Intuitive troubleshooting: Ability to see the end-to-end flow within Azure, and then use this information to take required actions.
  • Massive scalability with software-defined connectivityConnect global branch offices, point-of-sale locations, and sites using Azure and the Microsoft global network.
  • Optimize security and agility: Leverage secure transport network services and integrated firewall capabilities to ensure the secure delivery of all applications across your hybrid enterprise. Securely identify and manage the performance of today’s modern and encrypted applications running over SSL, TLS, and HTTPS.
  • One place for managing your network: Quickly respond to the needs of your business with application-centric, business intent-based policies that are centrally managed and applied network-wide across all remote locations.
  • Reduced costs: Maximize the use of redundancy and lower-cost connectivity options with zero-touch provisioning and centralized management to reduce the cost of deploying and maintaining a hybrid WAN architecture.
  • Reliability: Create a highly available WAN architecture that virtualizes and dynamically leverages multiple links at remote locations. Retain end-to-end visibility of network performance and end-user experience for troubleshooting and problem resolution.
  • Performance: Deliver superior application performance to your business with the industry-leading WAN optimization solution from SNP.

 

For more information on Azure Virtual WAN, contact SNP Technologies here.