Datawaves is a tool for blockchain research. Datawaves has structured Ethereum data so that you can work with it directly in your browser through a notebook interface for data wrangling, data visualization, exploratory data analysis, and machine learning.
Required skills
- Python
- Basics of Pandas
- Basics of Jupyter Notebook
- Basics of the data science packages the walk-through covers (sklearn/Altair/…)
Workshop Topic and Structure
Part 1: Datawaves
Intro: motivations for Datawaves
- What is Datawaves?
- Basic functions: Data, Notebook, Share, and Fork
- Differences from Colab and Dune
Intro: Spark
- What is Spark and Spark SQL?
- What is a Spark DataFrame?
- Differences between a Spark DataFrame and a Pandas DataFrame
- How to convert a Spark DataFrame to a Pandas DataFrame?
Part 2: Data visualization with Datawaves
- Introduction to the problem (TBD)
- Notebook interface: running cells, restarting and interrupting the kernel
- Issues that may occur when using the notebook
- Data tables and querying with SQL magic
  - How to list databases and tables in SQL
  - How to create a temporary view from a DataFrame
  - How to print the schema, columns, and rows of a DataFrame
- Data preparation (TBD)
- Intro to nft.trades table
- Writing Markdown
Part 3: Diving into Linear Regression
Linear Regression (TBD)
- Introduction to Linear Regression
- Intro to the problem
- Data preparation
- Code walkthrough
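To give a sense of what the code walkthrough might cover, here is a dependency-free sketch of simple linear regression using the closed-form least-squares estimates. The workshop itself lists sklearn among its packages, so this is a stand-in for illustration rather than the planned implementation; the function name `fit_line` and the data points are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic points that lie roughly on a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)

# Predict y for a new x using the fitted line.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)
```

With more than one feature, a library routine such as sklearn's `LinearRegression` is the more practical choice, since it handles the matrix form of the same least-squares problem.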
We plan to hold this workshop in early May.
Related Link