MSDN -> Synapse -> QuickStarts -> 2 Analyze using serverless SQL pool
https://docs.microsoft.com/en-us/azure/synapse-analytics/get-started-analyze-sql-on-demand
In Synapse Studio, go to the Develop hub
Create a new SQL script.
Paste the following code into the script.
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet',
    FORMAT='PARQUET'
) AS [result]
Click Run.
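Following step 3 exactly as written produces an error when the script runs: contosolake appears to be just a placeholder name. The error message is: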
File 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet' cannot be opened because it does not exist or it is used by another process.
After some searching, it turns out MSDN documents the fix in part 1 of the quickstart, which most readers skip:
MSDN -> Synapse -> QuickStarts -> 1 Creating a Synapse workspace
https://docs.microsoft.com/en-us/azure/synapse-analytics/get-started-create-workspace
Place sample data into the primary storage account
We are going to use a small 100K row sample dataset of NYC Taxi Cab data for many examples in this getting started guide. We begin by placing it in the primary storage account you created for the workspace.
- Download this file to your computer: https://azuresynapsestorage.blob.core.windows.net/sampledata/NYCTaxiSmall/NYCTripSmall.parquet
- In Synapse Studio, navigate to the Data Hub.
- Select Linked.
- Under the category Azure Data Lake Storage Gen2 you'll see an item with a name like myworkspace ( Primary - contosolake ).
- Select the container named users (Primary).
- Select Upload and select the NYCTripSmall.parquet file you downloaded.
Once the parquet file is uploaded, it is available through two equivalent URIs:
https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet
abfss://users@contosolake.dfs.core.windows.net/NYCTripSmall.parquet
In the examples that follow in this tutorial, make sure to replace contosolake in the UI with the name of the primary storage account that you selected for your workspace.
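The two URI forms above differ only in where the container name goes, which is easy to get wrong when substituting your own names. A minimal Python sketch of the mapping, assuming nothing beyond the two patterns quoted above (the helper name storage_uris is made up for illustration and is not part of any Azure SDK):

```python
def storage_uris(account, container, path):
    """Return the (https, abfss) URIs for one file in an ADLS Gen2 container.

    Hypothetical helper: it only formats strings following the two URI
    patterns quoted in the quickstart and never contacts Azure.
    """
    host = f"{account}.dfs.core.windows.net"
    path = path.lstrip("/")  # tolerate a leading slash in the file path
    https_uri = f"https://{host}/{container}/{path}"
    abfss_uri = f"abfss://{container}@{host}/{path}"
    return https_uri, abfss_uri


if __name__ == "__main__":
    # Replace "contosolake" and "users" with your own account and container.
    for uri in storage_uris("contosolake", "users", "NYCTripSmall.parquet"):
        print(uri)
```

Called with contosolake and users, it reproduces the two URIs listed above; with your own names it gives the values to paste into later examples.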
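In short: download the sample data from https://azuresynapsestorage.blob.core.windows.net/sampledata/NYCTaxiSmall/NYCTripSmall.parquet, upload it to your own storage account, and replace contosolake with the name of your own storage account. If your container is not named users, replace the container name in the URL as well. With placeholders for both names, the query becomes: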
SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://{StorageAccountName}.dfs.core.windows.net/{ContainerName}/NYCTripSmall.parquet',
        FORMAT='PARQUET'
    ) AS [result]
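If you would rather generate the script than edit the placeholders by hand, the substitution could be sketched in Python roughly as follows. The function name build_query and the default container users are assumptions made for illustration. The name checks follow the Azure naming rules (storage accounts: 3 to 24 lowercase letters and digits; containers: 3 to 63 lowercase letters, digits and single hyphens) and are only there to catch typos before the text is pasted into Synapse Studio.

```python
import re

# Same query as above, with Python format fields in place of the placeholders.
QUERY_TEMPLATE = """SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://{account}.dfs.core.windows.net/{container}/NYCTripSmall.parquet',
        FORMAT='PARQUET'
    ) AS [result]"""


def build_query(account, container="users"):
    """Fill in the storage account and container names (hypothetical helper)."""
    # Storage account names: 3-24 characters, lowercase letters and digits only.
    if not re.fullmatch(r"[a-z0-9]{3,24}", account):
        raise ValueError(f"invalid storage account name: {account!r}")
    # Container names: 3-63 characters, lowercase letters, digits and single
    # hyphens, starting and ending with a letter or digit.
    if not (3 <= len(container) <= 63
            and re.fullmatch(r"[a-z0-9](?:-?[a-z0-9])*", container)):
        raise ValueError(f"invalid container name: {container!r}")
    return QUERY_TEMPLATE.format(account=account, container=container)


if __name__ == "__main__":
    # "mystorageaccount" is a made-up name; use your workspace's primary account.
    print(build_query("mystorageaccount"))
```

The printed text can be pasted into a new SQL script in the Develop hub and run as in the quickstart.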