Using Hadoop through a SOCKS proxy?
So our Hadoop cluster runs on some nodes and can only be reached from those nodes. You SSH into them and do your work.
Since that is quite annoying, but (understandably) nobody will go near trying to configure access control so that the cluster is usable from outside for some people, I am trying the next best thing: using SSH to run a SOCKS proxy into the cluster, i.e.:
$ ssh -D localhost:10000 the.gateway cat
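Before involving Hadoop at all, it can help to confirm that the tunnel really speaks SOCKS. A minimal sketch, assuming the proxy listens on localhost:10000 as in the command above and accepts SOCKS5 without authentication (which is what ssh -D offers); the function name is made up for illustration:

```python
import socket

# SOCKS5 greeting: version 5, one auth method offered, method 0 = "no authentication".
SOCKS5_GREETING = b"\x05\x01\x00"


def socks5_accepts_no_auth(host="localhost", port=10000, timeout=3.0):
    """Return True if host:port answers the SOCKS5 greeting with 'no auth'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(SOCKS5_GREETING)
            reply = b""
            # The server's method-selection reply is exactly two bytes.
            while len(reply) < 2:
                chunk = conn.recv(2 - len(reply))
                if not chunk:  # peer closed the connection early
                    break
                reply += chunk
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False
    # Expected reply: version 5, selected method 0.
    return reply == b"\x05\x00"
```

Calling socks5_accepts_no_auth() with the defaults only shows that something on port 10000 answers like a SOCKS5 server; it says nothing about whether the Hadoop client is actually using it.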
There are whispers of SOCKS support (naturally I have not found any documentation), and apparently it goes into core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://reachable.from.behind.proxy:1234/</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>reachable.from.behind.proxy:5678</value>
</property>
<property>
  <name>hadoop.rpc.socket.factory.class.default</name>
  <value>org.apache.hadoop.net.SocksSocketFactory</value>
</property>
<property>
  <name>hadoop.socks.server</name>
  <value>localhost:10000</value>
</property>
I am just trying to run jobs, not administer the cluster. I only need to access HDFS and submit jobs through SOCKS. (It seems there is an entirely separate thing about using SSL/proxies between the cluster nodes etc.; I don't want that. My machine should not be part of the cluster, just a client.)
Is there any useful documentation on that? To illustrate my failure to turn up anything useful: I found the configuration values by running the client through strace -f and checking which configuration files it read.
Is there a description anywhere of which configuration values it even reacts to? (I have found literally zero reference documentation, just differently outdated tutorials; I hope I'm missing something?)
Is there a way to dump the configuration values it is actually using?
The original code implementing this was added to Hadoop, but this article also notes that you have to change the socket factory class to the SOCKS one, with:
<property>
  <name>hadoop.rpc.socket.factory.class.default</name>
  <value>org.apache.hadoop.net.SocksSocketFactory</value>
</property>

Edit: Note that the properties go in different files:
- fs.default.name, hadoop.socks.server and hadoop.rpc.socket.factory.class.default go in core-site.xml
- mapred.job.tracker and its companion mapred.job.tracker.* setting go in mapred-site.xml (the map-reduce config)
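To check that split on a client machine, the site files can simply be parsed and compared against the list above. A rough sketch, assuming the usual layout of a conf directory containing core-site.xml and mapred-site.xml; the helper names and the EXPECTED_FILE table are illustrative, not part of Hadoop. It only reads what is written in the files, so it does not show defaults or the values the client finally resolves at runtime:

```python
import os
import xml.etree.ElementTree as ET

# Which file each property should live in, per the list above.
EXPECTED_FILE = {
    "fs.default.name": "core-site.xml",
    "hadoop.socks.server": "core-site.xml",
    "hadoop.rpc.socket.factory.class.default": "core-site.xml",
    "mapred.job.tracker": "mapred-site.xml",
}


def read_site_file(path):
    """Return {name: value} for every <property> element in one site file."""
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def misplaced_properties(conf_dir):
    """Return (property, expected file) pairs that are absent from their expected file."""
    missing = []
    for name, filename in EXPECTED_FILE.items():
        path = os.path.join(conf_dir, filename)
        props = read_site_file(path) if os.path.exists(path) else {}
        if name not in props:
            missing.append((name, filename))
    return missing
```

For example, misplaced_properties("/path/to/conf") returns an empty list when every property is in the file named above, and otherwise lists what still has to be moved or added.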