Locally reading S3 files through Spark (or better: pyspark)
I want to read an S3 file from my (local) machine, through Spark (pyspark, really). Now, I keep getting authentication errors like
java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties (respectively).
I looked everywhere here and on the web, and tried many things, but apparently S3 has been changing over the last year or months, and all methods failed but this one:
pyspark.SparkContext().textFile("s3n://user:password@bucket/key")
(note the s3n [s3 did not work]). Now, I don't want to use a URL with the user and password, because they can appear in logs, and I am also not sure how to get them from the ~/.aws/credentials file anyway.
So, how can I read locally from S3 through Spark (or, better, pyspark) using the AWS credentials from the standard ~/.aws/credentials file (ideally, without copying the credentials there to yet another configuration file)?
PS: I tried os.environ["AWS_ACCESS_KEY_ID"] = … and os.environ["AWS_SECRET_ACCESS_KEY"] = …, but it did not work.
PPS: I am not sure where to "set the fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties" (Google did not come up with anything). However, I did try many ways of setting these: SparkContext.setSystemProperty(), sc.setLocalProperty(), and conf = SparkConf(); conf.set(…); conf.set(…); sc = SparkContext(conf=conf). Nothing worked (a sketch of the SparkConf variant is below).
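For concreteness, the SparkConf attempt had roughly this shape (a sketch, with the property names taken from the error message and placeholder values):

from pyspark import SparkConf, SparkContext

# Set the s3n credential properties named in the error message
# before creating the SparkContext.
conf = SparkConf()
conf.set('fs.s3n.awsAccessKeyId', '...')      # placeholder, not a real key
conf.set('fs.s3n.awsSecretAccessKey', '...')  # placeholder, not a real secret
sc = SparkContext(conf=conf)

# Reading still fails with the same IllegalArgumentException
# (count() forces evaluation, since textFile alone is lazy).
rdd = sc.textFile('s3n://bucket/key')
rdd.count()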
Yes, you have to use s3n instead of s3. s3 is some weird abuse of S3, the benefits of which are unclear to me.
You can pass the credentials to the sc.hadoopFile or sc.newAPIHadoopFile calls:
rdd = sc.hadoopFile('s3n://my_bucket/my_file',
                    conf={'fs.s3n.awsAccessKeyId': '...',
                          'fs.s3n.awsSecretAccessKey': '...'})
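If you would rather pull the keys from the standard ~/.aws/credentials file than hard-code them, a minimal sketch could look like the following (it assumes a [default] profile in that file, and fills in the text input format and key/value classes that hadoopFile also expects):

import os
import configparser

from pyspark import SparkContext

# Read the keys from the standard ~/.aws/credentials file ([default] profile).
aws = configparser.ConfigParser()
aws.read(os.path.expanduser('~/.aws/credentials'))
access_key = aws['default']['aws_access_key_id']
secret_key = aws['default']['aws_secret_access_key']

sc = SparkContext()

# Plain text files use the standard Hadoop text input format and classes.
rdd = sc.hadoopFile(
    's3n://my_bucket/my_file',
    'org.apache.hadoop.mapred.TextInputFormat',
    'org.apache.hadoop.io.LongWritable',
    'org.apache.hadoop.io.Text',
    conf={
        'fs.s3n.awsAccessKeyId': access_key,
        'fs.s3n.awsSecretAccessKey': secret_key,
    },
)
lines = rdd.map(lambda kv: kv[1])  # keep the text, drop the byte-offset keys

Another commonly used variant is to set the same two properties on sc._jsc.hadoopConfiguration() (a non-public attribute, so treat it as a workaround) and then read with a plain sc.textFile('s3n://my_bucket/my_file').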
authentication amazon-s3 apache-spark credentials pyspark