Casey Karst, Program Manager II, Data Systems, announced in a post that "PolyBase in SQL Server 2016 and later can connect to Hadoop clusters with the hadoop.rpc.protection configuration set to Integrity, Privacy or Authentication."
Supporting this configuration allows PolyBase to connect to and query Hadoop clusters that have wire encryption turned on. This enables a secure connection between Hadoop and SQL Server, as well as among the Hadoop data nodes.
To connect to a Hadoop cluster with hadoop.rpc.protection set to privacy or integrity, users need to alter the core-site.xml file that is installed with PolyBase. This file is generally found at C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf.
To use this configuration, users must add a new property named hadoop.rpc.protection with a value of either privacy or integrity.
"These values must match the hadoop.rpc.protection configuration on your Hadoop cluster.
<!-- RPC Encryption information, PLEASE FILL VALUE IN ACCORDING TO HADOOP CLUSTER CONFIG -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>
When changing XML files it is important to ensure that input value is correct and maintain the validity of the XML file format. If the changes are invalid, PolyBase will not run."
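Because an invalid core-site.xml stops PolyBase from running, it can help to make the change programmatically and confirm that the result still parses. The following is a minimal sketch of that idea, not a Microsoft-provided tool: the function name set_rpc_protection and the ALLOWED set are hypothetical, and the code assumes the file uses the usual Hadoop layout of a <configuration> root containing <property> elements. It works on the XML text only, and Python's standard parser drops comments, so keep a backup of the original file before writing anything back.

```python
import xml.etree.ElementTree as ET

# Values named in the announcement (assumed here to be written in lowercase).
ALLOWED = {"authentication", "integrity", "privacy"}
PROPERTY_NAME = "hadoop.rpc.protection"


def set_rpc_protection(xml_text: str, value: str) -> str:
    """Return core-site.xml text with hadoop.rpc.protection set to `value`.

    Hypothetical helper for illustration; raises if the input is not
    well-formed XML or the value is not one of the expected settings.
    """
    if value not in ALLOWED:
        raise ValueError(f"unsupported value: {value!r}")

    # ET.fromstring raises ET.ParseError if the input is malformed.
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError("expected a <configuration> root element")

    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == PROPERTY_NAME:
            # Property already present: update its value in place.
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property missing: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = PROPERTY_NAME
        ET.SubElement(prop, "value").text = value

    updated = ET.tostring(root, encoding="unicode")
    ET.fromstring(updated)  # re-parse to confirm the result is well-formed
    return updated
```

The value passed in still has to match the hadoop.rpc.protection setting on the Hadoop cluster itself; the sketch only guards against typos and malformed XML, not against a mismatch with the cluster.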
Karst said that this functionality is available by default in all SQL Server installations that currently have PolyBase installed, so no additional update is required. However, the functionality is not currently available in Azure SQL Data Warehouse, Azure SQL Database, or Analytics Platform System.