Financial services organizations use data from various sources to discover new insights and improve trading decisions. Finding the right dataset and getting access to the data can frequently be a time-consuming process. For example, to analyze daily trading activity, analysts need to find a list of available databases and tables, identify the owner's contact information, get access, understand the table schema, and load the data. They repeat this process for every additional dataset needed for the analysis.

Amazon FinSpace makes it easy for analysts and quants to discover, analyze, and share data, reducing the time it takes to find and access financial data from months to minutes. To get started, FinSpace admins create a category and an attribute set to capture relevant external reference information such as database type and table name. After connecting to a data source or uploading data directly through the FinSpace user interface (UI), you can create datasets in FinSpace that include schema and other relevant information. Analysts can then search the catalog for the datasets they need and connect to them using the FinSpace web interface or through the FinSpace JupyterLab notebook.

Amazon Redshift is a popular choice for storing and querying exabytes of structured and semi-structured data such as trade transactions. In this post, we explore how to connect to an Amazon Redshift data warehouse from FinSpace through a Spark SQL JDBC connection and populate the FinSpace catalog with metadata such as schema details, dataset owner, and description. We then show how simple it is to use the FinSpace catalog to discover available data and to connect to an Amazon Redshift cluster from a Jupyter notebook in FinSpace to read daily trades for Amazon (AMZN) stock. Finally, we evaluate how well our stock purchases were executed by comparing our transactions stored in Amazon Redshift to the trading history for the stock stored in FinSpace.

The blog post covers the following steps:

1. Set up Amazon Redshift integration notebooks.
2. Configure your FinSpace catalog to describe your Amazon Redshift tables.
3. Use FinSpace notebooks to connect to Amazon Redshift.
4. Populate the FinSpace catalog with tables from Amazon Redshift. Add description, owner, and attributes to each dataset to help with data discovery and access control.
5. Use FinSpace notebooks to analyze data from both FinSpace and Amazon Redshift to evaluate trade performance based on the daily price for AMZN stock.

The diagram below provides the complete solution overview.

[Diagram: complete solution overview]

Before you get started, make sure you have the following prerequisites:

- A FinSpace environment. For instructions on creating a new environment, see Create an Amazon FinSpace Environment.
- Ensure you have permissions to manage categories and controlled vocabularies and manage attribute sets in FinSpace.
- Create an Amazon Redshift cluster in the same AWS account as the FinSpace environment. Additionally, create a superuser and ensure that the cluster is publicly accessible.
- Create a table in Amazon Redshift and insert trading transaction data using these SQL queries.

Set up Amazon Redshift integration notebooks

Download Jupyter notebooks covering Amazon Redshift dataset import and analysis. The code provided in this blog post should be run from the FinSpace notebooks.

1. On the FinSpace console, choose “Open Notebook”.
2. Choose Git on the left navigation panel and select “Clone a Repository”.
3. Paste the link to the repo with FinSpace examples.
4. Navigate to amazon-finspace-examples/blogs/finspace_redshift-2021-09 and open both Jupyter notebooks that contain Amazon Redshift integration code.
5. If prompted, select FinSpace PySpark kernel.

Configure your FinSpace catalog to describe your Amazon Redshift tables

FinSpace users can discover relevant datasets by using search or by navigating across categories under the Categories menu. Categories allow for cataloging of datasets by commonly used business terms (such as source, data class, type, industry, and so on). An attribute set holds additional metadata for each dataset, including categories and table details to enable you to connect to the data source directly from a FinSpace notebook. In the next sections, we will update and run the notebooks to connect to Amazon Redshift and import datasets.
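The Spark SQL JDBC connection to Amazon Redshift that this post relies on can be sketched roughly as follows. This is a minimal illustration, not the code from the example notebooks: the helper names, the cluster endpoint, the table name `public.trade_history`, and the credentials are all hypothetical placeholders, and it assumes the Amazon Redshift JDBC driver is already available to the Spark cluster.

```python
from typing import Dict


def redshift_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a JDBC URL of the form jdbc:redshift://host:port/database."""
    return f"jdbc:redshift://{host}:{port}/{database}"


def redshift_reader_options(url: str, table: str, user: str, password: str) -> Dict[str, str]:
    """Collect the options used by Spark's generic JDBC data source."""
    return {"url": url, "dbtable": table, "user": user, "password": password}


def read_redshift_table(spark, options: Dict[str, str]):
    """Load one Redshift table as a Spark DataFrame.

    `spark` is the SparkSession provided by the notebook kernel.
    Assumes the Redshift JDBC driver is on the cluster's classpath.
    """
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical placeholder values; substitute your own cluster details.
url = redshift_jdbc_url(
    "example-cluster.abc123.us-east-1.redshift.amazonaws.com", 5439, "dev"
)
options = redshift_reader_options(
    url, "public.trade_history", "example_user", "example_password"
)
print(options["url"])
```

In a notebook session, something like `df = read_redshift_table(spark, options)` followed by `df.printSchema()` would then expose the table schema that can be recorded in the FinSpace catalog.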