Install Zeppelin 0.7.3 on Windows

Raymond Tang, 3/3/2018

This post summarizes the steps to install Zeppelin 0.7.3 in a Windows environment.

Tools and Environment

  • GIT Bash
  • Command Prompt
  • Windows 10

Download Binary Package

Download the Zeppelin 0.7.3 binary package (the latest release at the time of writing) from the following website:

http://zeppelin.apache.org/download.html

In my case, I saved the file to the folder F:\DataAnalytics.

Unzip Binary Package

Open Git Bash, change directory (cd) to the folder where you saved the binary package, and then extract it:

$ cd /f/DataAnalytics

$ tar -xvzf zeppelin-0.7.3-bin-all.gz

After running the above commands, the package is extracted to the folder F:\DataAnalytics\zeppelin-0.7.3-bin-all.
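
If tar is not available in your shell, the same extraction can be done with Python's standard tarfile module. The following is only a sketch: the function name extract_package is my own, and it assumes the download is a gzip-compressed tar archive, as the tar flags above suggest.

```python
import tarfile
from pathlib import Path


def extract_package(archive, dest):
    """Extract a gzip-compressed tar archive into dest.

    Returns the sorted top-level names found in the archive.
    """
    archive, dest = Path(archive), Path(dest)
    # Newer Python versions support extraction filters; use the safe
    # "data" filter when it exists and fall back to the default otherwise.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, mode="r:gz") as tar:
        top_level = sorted({m.name.split("/")[0] for m in tar.getmembers()})
        tar.extractall(dest, **kwargs)
    return top_level
```

With the paths used in this post, the call would be extract_package(r"F:\DataAnalytics\zeppelin-0.7.3-bin-all.gz", r"F:\DataAnalytics"), and the returned list should contain the single folder name zeppelin-0.7.3-bin-all.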

Run Zeppelin

Before starting Zeppelin, make sure the JAVA_HOME environment variable is set.

JAVA_HOME environment variable

The value of JAVA_HOME should be the path to your Java JRE installation.

https://api.kontext.tech/resource/e055281c-82b0-5006-a25c-60318c78535f
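
A quick way to confirm the setting before starting Zeppelin is a small Python check. This is a minimal sketch with a helper name of my own (check_java_home); it assumes a standard Java layout where the executable lives in the bin subfolder of JAVA_HOME.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    bin_dir = Path(java_home) / "bin"
    # java.exe on Windows, plain "java" on other systems
    for name in ("java.exe", "java"):
        if (bin_dir / name).is_file():
            return True, f"JAVA_HOME looks valid: {java_home}"
    return False, f"No java executable found under {bin_dir}"
```

Calling check_java_home() with no arguments reads the environment of the current process, so run it from a newly opened terminal after changing the variable.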

Start Zeppelin

Run the following commands in Command Prompt (remember to change the path to your own Zeppelin folder):

cd /D F:\DataAnalytics\zeppelin-0.7.3-bin-all\bin

F:\DataAnalytics\zeppelin-0.7.3-bin-all\bin>zeppelin.cmd

Wait until the Zeppelin server has started:

https://api.kontext.tech/resource/8e7eccb8-89b5-5f02-a2c4-5385ed76dfb9
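
Instead of watching the console, you can poll the port until it accepts connections. The sketch below uses only the Python standard library; the function name wait_for_port and the timeout values are my own choices, and 8080 is the port used in the Verify step of this post. An open port only shows that the server is listening, so the browser check is still the real test.

```python
import socket
import time


def wait_for_port(host="localhost", port=8080, timeout=120.0, interval=2.0):
    """Poll host:port until a TCP connection succeeds or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=5):
                return True  # something is listening on the port
        except OSError:
            pass  # not accepting connections yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The function returns True as soon as a connection succeeds and False if the timeout expires first.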

Verify

In any browser, navigate to http://localhost:8080/

The UI should look like the following screenshot:

https://api.kontext.tech/resource/028d8beb-3314-5be3-9d9e-66d877fd289e

Create Notebook

Create a simple note using Markdown and then run it:

https://api.kontext.tech/resource/1f8c5f31-edca-5199-af30-dbfb73aac5a5

java.lang.NullPointerException

If you get this error when using Spark as the interpreter, refer to the following pages for details:

https://issues.apache.org/jira/browse/ZEPPELIN-2438

https://issues.apache.org/jira/browse/ZEPPELIN-2475

Basically, even if you configure the Spark interpreter not to use Hive, Zeppelin still tries to locate winutils.exe through the HADOOP_HOME environment variable.

To resolve the problem, install Hadoop on your local system and then add the HADOOP_HOME environment variable:

https://api.kontext.tech/resource/4381dde5-722a-5ca4-b602-c922e5b8aa75
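
To double-check the variable, you can look for winutils.exe in the bin folder under HADOOP_HOME, which is where Hadoop expects it on Windows. This is a minimal sketch; find_winutils is a helper name I made up for illustration.

```python
import os
from pathlib import Path


def find_winutils(env=None):
    """Return the path to winutils.exe under HADOOP_HOME, or None if missing."""
    env = os.environ if env is None else env
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return None  # variable not set at all
    candidate = Path(hadoop_home) / "bin" / "winutils.exe"
    return candidate if candidate.is_file() else None
```

If find_winutils() returns None, either HADOOP_HOME is not visible to the current process or the file is not in its bin folder.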

After the environment variable is added, restart the Zeppelin server; you should then be able to run Spark successfully.

https://api.kontext.tech/resource/7447b6dc-21a3-58d4-af73-515ac2b59162

You should also be able to run the tutorials provided as part of the installation:

https://api.kontext.tech/resource/089153fb-2644-553e-85ba-0ad3e9d3ae28

org.apache.zeppelin.interpreter.InterpreterException:

If you encounter the following error:

org.apache.zeppelin.interpreter.InterpreterException: The filename, directory name, or volume label syntax is incorrect.

    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:143)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.reference(RemoteInterpreterProcess.java:73)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:265)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:430)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:111)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

If you have installed Spark locally, it is probably caused by the issue described in this JIRA ticket:

https://issues.apache.org/jira/browse/ZEPPELIN-2677

To fix it, remove the SPARK_HOME environment variable. Spark should still run correctly if you start the Spark shell using the full path to spark-shell.cmd.
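
If you would rather not delete the variable system-wide, one option is to start Zeppelin from a process whose environment simply lacks SPARK_HOME. The sketch below only builds such an environment; env_without is a hypothetical helper name, and I am assuming (not verified against the JIRA ticket) that hiding the variable from the Zeppelin process has the same effect as removing it. It will not help if SPARK_HOME is also set in Zeppelin's own configuration.

```python
import os


def env_without(name, env=None):
    """Return a copy of env with the variable name removed.

    The comparison ignores case because Windows variable names are
    case-insensitive.
    """
    env = os.environ if env is None else env
    return {k: v for k, v in env.items() if k.upper() != name.upper()}


# Possible usage (commented out; the path is the one from this post):
# import subprocess
# subprocess.Popen(
#     ["cmd", "/c", "zeppelin.cmd"],
#     cwd=r"F:\DataAnalytics\zeppelin-0.7.3-bin-all\bin",
#     env=env_without("SPARK_HOME"),
# )
```

The original environment is left untouched, so other tools that rely on SPARK_HOME keep working.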
