Quantcast
Channel: Hue, the self service open source Analytics Workbench for browsing, querying and visualizing data interactively
Viewing all articles
Browse latest Browse all 171

Beta of new Notebook Application for Spark & SQL

$
0
0

Last year we released Spark Igniter to enable developers to submit Spark jobs through a Web Interface. While this approach worked, the UX left a lot to be desired. Programs had to implement an interface, be compiled beforehand and YARN support was lacking. We also wanted to add support for Python and Scala, focusing on delivering an interactive and iterative programming experience similar to using a REPL.

notebook-1

 

This is for this that we started developing a new Spark REST Job Server that could provide these missing functionalities. On top of it, we revamped the UI for providing a Python Notebook-like feeling.

Note that this new application is pretty new and labeled as ‘Beta’. This means we recommend trying it out and contributing, but its usage is not officially supported yet as the UX is going to evolve a lot!

This post describes the Web Application part. We are using Spark 1.3 and Hue master branch.

 

 

Based on a new:

Supports:

  • Scala
  • Python
  • Java
  • SQL
  • YARN

 

If the Spark app is not visible in the ‘Editor’ menu, you will need to unblacklist it from the hue.ini:

 

[desktop]
app_blacklist=

 

On the same machine as Hue, go in the Hue home:

 

If using the package installed:

cd /usr/lib/hue

 

If using Cloudera Manager:

cd /opt/cloudera/parcels/CDH/lib/
HUE_CONF_DIR=/var/run/cloudera-scm-agent/process/-hue-HUE_SERVER-id
echo $HUE_CONF_DIR
export HUE_CONF_DIR

 

And start the Spark Job Server:

./build/env/bin/hue livy_server

 

You can customize the setup by modifying these properties in the hue.ini:

[spark]
# URL of the REST Spark Job Server.
server_url=http://localhost:8090/

# List of available types of snippets
languages='[{"name": "Scala", "type": "scala"},{"name": "Python", "type": "python"},{"name": "Impala SQL", "type": "impala"},{"name": "Hive SQL", "type": "hive"},{"name": "Text", "type": "text"}]'

# Uncomment to use the YARN mode
## livy_server_session_kind=yarn

 

Next

This Beta version brings a good set of features, a lot more is on the way. In the long term, we expect all the query editors (e.g. Pig, DBquery, Phoenix…) to use this common interface. Later, individual snippets could be drag & dropped for making visual dashboards, notebooks could be embedded like in Dropbox or Google docs.

We are also interested in getting feedback on the new  Spark REST Job Server and see what the community thinks about it (contributions are welcomed ;).

As usual feel free to comment on the hue-user list or @gethue!

 


Viewing all articles
Browse latest Browse all 171

Trending Articles