Apache Spark : ERROR : socket.gaierror: [Errno -2] Name or service not known

Running “./bin/pyspark” fails with the error “socket.gaierror: [Errno -2] Name or service not known”:


[root@ip-10-0-0-28 spark]# ./bin/pyspark
Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
/root/spark/spark/python/pyspark/sql/context.py:487: DeprecationWarning: HiveContext is deprecated in Spark 2.0.0. Please use SparkSession.builder.enableHiveSupport().getOrCreate() instead.
DeprecationWarning)
16/12/06 13:58:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/root/spark/spark/python/pyspark/shell.py", line 43, in <module>
    spark = SparkSession.builder\
  File "/root/spark/spark/python/pyspark/sql/session.py", line 169, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/root/spark/spark/python/pyspark/context.py", line 294, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/root/spark/spark/python/pyspark/context.py", line 115, in __init__
    conf, jsc, profiler_cls)
  File "/root/spark/spark/python/pyspark/context.py", line 174, in _do_init
    self._accumulatorServer = accumulators._start_update_server()
  File "/root/spark/spark/python/pyspark/accumulators.py", line 259, in _start_update_server
    server = AccumulatorServer(("localhost", 0), _UpdateRequestHandler)
  File "/usr/lib64/python2.6/SocketServer.py", line 412, in __init__
    self.server_bind()
  File "/usr/lib64/python2.6/SocketServer.py", line 423, in server_bind
    self.socket.bind(self.server_address)
  File "<string>", line 1, in bind
socket.gaierror: [Errno -2] Name or service not known

To fix this, open “/etc/hosts” on the server and add the following line:

127.0.0.1 localhost

The traceback shows that pyspark cannot resolve the host name “localhost”: the accumulator server tries to bind to ("localhost", 0) and the name lookup fails. Check your /etc/hosts file; if it has no entry for localhost, adding one should resolve the issue.

For example:

[Ip] [Hostname] localhost
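
Before or after editing the hosts file, you can check whether the name resolves at all. The snippet below is a minimal sketch, not part of Spark: the helper name can_resolve is made up for illustration, and it only calls socket.getaddrinfo, which performs the same kind of lookup that fails in the traceback above.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can map hostname to an address.

    Hypothetical helper for illustration; it is not part of pyspark.
    """
    try:
        # Port 0 mirrors the ("localhost", 0) address used by pyspark.
        socket.getaddrinfo(hostname, 0)
        return True
    except socket.gaierror:
        # This is the exception type reported in the traceback.
        return False
```

Calling can_resolve("localhost") on the affected server should return False while the hosts entry is missing and True once it has been added.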

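If you cannot change the hosts file on the server, edit /python/pyspark/accumulators.py instead. Find the line that creates the AccumulatorServer (line 259 in the traceback above; the number may differ between Spark versions) and change it as follows: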

server = AccumulatorServer(("[server host name from hosts file]", 0), _UpdateRequestHandler)
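
To decide which host name to put there, it can help to test which names the machine can actually bind to. The following is a rough sketch under that assumption; first_bindable_host is a hypothetical helper, not a pyspark function, and it simply tries the same bind call that SocketServer makes, on an ephemeral port.

```python
import socket


def first_bindable_host(candidates):
    """Return the first host name in candidates that a TCP socket can bind to.

    Hypothetical helper for illustration. Returns None if no candidate works.
    """
    for host in candidates:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Port 0 asks the OS for any free port, as pyspark does.
            sock.bind((host, 0))
            return host
        except socket.error:
            # Covers socket.gaierror (name lookup failed) as well as
            # other bind failures; try the next candidate.
            pass
        finally:
            sock.close()
    return None
```

For example, first_bindable_host(["localhost", "127.0.0.1"]) returns the first of the two that works on the machine where it is run.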

YACassandraPDO : How to fetch UUID column from cassandra table

It has been observed that when a UUID field is fetched using the YACassandraPDO driver in PHP, the value comes back as garbage. For example:

select dateof(uuidfield) as theTimeStamp from table;

The output will be:

array(1) {
[0]=>
array(2) {
["theTimeStamp"]=>
string(8) "�;e$��"
[0]=>
string(8) "�;e$��"
}
}

Expected:

array(1) {
[0]=>
array(2) {
["theTimeStamp"]=>
string(24) "2012-12-04 10:00:00+0100"
[0]=>
string(24) "2012-12-04 10:00:00+0100"
}
}

For timestamp values, the PDO driver returns the raw 8-byte binary value (milliseconds since the Unix epoch) rather than a formatted date. You can use the function below to convert it back to a date string.


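// Convert the raw 8-byte timestamp returned by the driver to a date string.
function getDateStringFromHex($str) {
    $date = unpack('H*', $str);              // raw bytes -> hex string
    $time = (int) (hexdec($date[1]) / 1000); // milliseconds -> whole seconds
    $dateStr = date('Y-m-d H:i:s', $time);
    return $dateStr;
}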
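
For readers who handle the same raw value outside PHP, the conversion can be sketched in Python as well. This assumes, as the PHP function above does, that the driver hands back 8 bytes holding a big-endian count of milliseconds since the Unix epoch; the function name is made up for illustration and the code targets Python 3.

```python
import struct
from datetime import datetime, timedelta, timezone


def cassandra_timestamp_to_datetime(raw):
    """Decode an 8-byte big-endian millisecond timestamp into a UTC datetime.

    Hypothetical helper; assumes raw is the binary value returned by the driver.
    """
    if len(raw) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(raw))
    # ">q" reads one signed 64-bit integer in big-endian byte order.
    (millis,) = struct.unpack(">q", raw)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Adding a timedelta avoids floating-point rounding of the milliseconds.
    return epoch + timedelta(milliseconds=millis)
```

The result is timezone-aware and in UTC, so it can be formatted with strftime('%Y-%m-%d %H:%M:%S') or converted to a local zone as needed.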