Project author: palantir

Project description:
Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.
Language: Java
Repository: git://github.com/palantir/hadoop-crypto.git
Created: 2016-06-28T16:18:49Z
Project page: https://github.com/palantir/hadoop-crypto

License: Apache License 2.0


Seekable Crypto

Seekable Crypto is a Java library that provides the ability to seek within
SeekableInputs while decrypting the underlying contents along with some
utilities for storing and generating the keys used to encrypt/decrypt the data
streams. An implementation of the Hadoop FileSystem is also included that uses
the Seekable Crypto library to provide efficient and transparent client-side
encryption for Hadoop filesystems.

Supported Ciphers

Currently AES/CTR/NoPadding and AES/CBC/PKCS5Padding are supported.
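
To illustrate why a counter mode such as AES/CTR lends itself to seeking: the keystream for any 16-byte block depends only on the key and on the initial counter plus the block index, so decryption can start in the middle of the ciphertext. The following is a minimal sketch that uses only the JDK's javax.crypto classes, not this library's API. The class name CtrSeekSketch and its helper methods are hypothetical, and the library's own implementation may differ.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only (hypothetical class): random-access decryption of AES/CTR ciphertext. */
final class CtrSeekSketch {
    private static final int BLOCK_SIZE = 16;

    /** Returns the counter block for the AES block that contains byte {@code position}. */
    static byte[] counterAt(byte[] initialCounter, long position) {
        BigInteger counter = new BigInteger(1, initialCounter)
                .add(BigInteger.valueOf(position / BLOCK_SIZE))
                .mod(BigInteger.ONE.shiftLeft(8 * BLOCK_SIZE)); // wrap like a 128-bit counter
        byte[] raw = counter.toByteArray();
        byte[] result = new byte[BLOCK_SIZE];
        // Right-align the big-endian value; toByteArray may add a sign byte or drop leading zeros.
        int length = Math.min(raw.length, BLOCK_SIZE);
        System.arraycopy(raw, raw.length - length, result, BLOCK_SIZE - length, length);
        return result;
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }

    /** Decrypts {@code length} bytes starting at {@code position} without reading earlier bytes. */
    static byte[] decryptRange(byte[] key, byte[] iv, byte[] ciphertext, int position, int length)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(counterAt(iv, position)));
        // Discard the keystream bytes between the block boundary and the requested position.
        cipher.update(new byte[position % BLOCK_SIZE]);
        return cipher.doFinal(ciphertext, position, length);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[BLOCK_SIZE];
        random.nextBytes(key);
        random.nextBytes(iv);
        byte[] plaintext = "0123456789abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = encrypt(key, iv, plaintext);
        byte[] middle = decryptRange(key, iv, ciphertext, 20, 5);
        System.out.println(new String(middle, StandardCharsets.UTF_8)); // prints "klmno"
    }
}
```

A padded block mode such as AES/CBC/PKCS5Padding needs a different approach to seeking, because each plaintext block depends on the previous ciphertext block rather than on a counter.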

Disclaimer: Neither supported AES mode is authenticated.
Authentication should be performed by consumers of this library via an
external cryptographic mechanism such as Encrypt-then-MAC. Failure to
properly authenticate ciphertext breaks security in some scenarios where an
attacker can manipulate ciphertext inputs.
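
As one possible shape for the Encrypt-then-MAC approach mentioned above, the sketch below encrypts with AES/CTR and then computes an HMAC-SHA256 tag over the IV and ciphertext, verifying the tag before any decryption. It uses only JDK classes. The class name EncryptThenMacSketch, the seal/open method names, and the output layout (IV, then ciphertext, then tag) are assumptions of this example and are not part of the library.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only (hypothetical class): Encrypt-then-MAC with AES/CTR and HMAC-SHA256. */
final class EncryptThenMacSketch {
    private static final int IV_LENGTH = 16;
    private static final int TAG_LENGTH = 32; // HMAC-SHA256 output size

    /** Assumed layout: IV || ciphertext || HMAC(IV || ciphertext). Uses separate enc/MAC keys. */
    static byte[] seal(byte[] encKey, byte[] macKey, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(encKey, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        int bodyLength = IV_LENGTH + ciphertext.length;
        byte[] sealed = new byte[bodyLength + TAG_LENGTH];
        System.arraycopy(iv, 0, sealed, 0, IV_LENGTH);
        System.arraycopy(ciphertext, 0, sealed, IV_LENGTH, ciphertext.length);
        System.arraycopy(tag(macKey, sealed, bodyLength), 0, sealed, bodyLength, TAG_LENGTH);
        return sealed;
    }

    /** Verifies the tag first and only then decrypts; throws if the input was modified. */
    static byte[] open(byte[] encKey, byte[] macKey, byte[] sealed) throws GeneralSecurityException {
        int bodyLength = sealed.length - TAG_LENGTH;
        if (bodyLength < IV_LENGTH) {
            throw new GeneralSecurityException("input too short");
        }
        byte[] expected = tag(macKey, sealed, bodyLength);
        byte[] actual = Arrays.copyOfRange(sealed, bodyLength, sealed.length);
        // MessageDigest.isEqual is intended to avoid leaking the mismatch position through timing.
        if (!MessageDigest.isEqual(expected, actual)) {
            throw new GeneralSecurityException("authentication tag mismatch");
        }
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(encKey, "AES"),
                new IvParameterSpec(sealed, 0, IV_LENGTH));
        return cipher.doFinal(sealed, IV_LENGTH, bodyLength - IV_LENGTH);
    }

    private static byte[] tag(byte[] macKey, byte[] data, int length) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        mac.update(data, 0, length);
        return mac.doFinal();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        byte[] encKey = new byte[16];
        byte[] macKey = new byte[32];
        random.nextBytes(encKey);
        random.nextBytes(macKey);
        byte[] sealed = seal(encKey, macKey, "0123456789".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(open(encKey, macKey, sealed), StandardCharsets.UTF_8));
    }
}
```

Note that a single tag over the whole ciphertext has to be checked before reading, which conflicts with seeking into large files; schemes that keep random access typically authenticate fixed-size chunks separately.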

Programmatic Example

Source for examples can be found here

    byte[] bytes = "0123456789".getBytes(StandardCharsets.UTF_8);

    // Store this key material for future decryption
    KeyMaterial keyMaterial = SeekableCipherFactory.generateKeyMaterial(AesCtrCipher.ALGORITHM);
    ByteArrayOutputStream os = new ByteArrayOutputStream(bytes.length);

    // Encrypt some bytes
    OutputStream encryptedStream = CryptoStreamFactory.encrypt(os, keyMaterial, AesCtrCipher.ALGORITHM);
    encryptedStream.write(bytes);
    encryptedStream.close();
    byte[] encryptedBytes = os.toByteArray();

    // Bytes written to stream are encrypted
    assertThat(encryptedBytes).isNotEqualTo(bytes);

    SeekableInput is = new InMemorySeekableDataInput(encryptedBytes);
    SeekableInput decryptedStream = CryptoStreamFactory.decrypt(is, keyMaterial, AesCtrCipher.ALGORITHM);

    // Seek to the last byte in the decrypted stream and verify its decrypted value
    byte[] readBytes = new byte[bytes.length];
    decryptedStream.seek(bytes.length - 1);
    decryptedStream.read(readBytes, 0, 1);
    assertThat(readBytes[0]).isEqualTo(bytes[bytes.length - 1]);

    // Seek to the beginning of the decrypted stream and verify it's equal to the raw bytes
    decryptedStream.seek(0);
    decryptedStream.read(readBytes, 0, bytes.length);
    assertThat(readBytes).isEqualTo(bytes);

Hadoop Crypto

Hadoop Crypto is a library for per-file client-side encryption in Hadoop
FileSystems such as HDFS or S3. It provides wrappers for the Hadoop FileSystem
API that transparently encrypt and decrypt the underlying streams. The
encryption algorithm uses Key Encapsulation: each file is encrypted with a
unique symmetric key, which is itself secured with a public/private key pair
and stored alongside the file.
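
The key encapsulation idea can be sketched with the JDK alone: generate a fresh AES key per file, wrap it with an RSA public key, and later unwrap it with the matching private key. This is a minimal illustration, not the library's code. The class name KeyEncapsulationSketch, the 256-bit AES key size, and the OAEP padding transformation are assumptions of this example; the source does not state which wrapping transformation the library uses.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Illustrative only (hypothetical class): wrapping a per-file AES key with an RSA key pair. */
final class KeyEncapsulationSketch {
    // Assumption of this sketch; not taken from the library.
    private static final String WRAP_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Generates the unique symmetric key that would encrypt a single file. */
    static SecretKey newFileKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    /** Encrypts (wraps) the file key so it can be stored next to the encrypted file. */
    static byte[] wrap(SecretKey fileKey, PublicKey publicKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(WRAP_TRANSFORMATION);
        cipher.init(Cipher.WRAP_MODE, publicKey);
        return cipher.wrap(fileKey);
    }

    /** Recovers the file key from its wrapped form using the private key. */
    static SecretKey unwrap(byte[] wrappedKey, PrivateKey privateKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(WRAP_TRANSFORMATION);
        cipher.init(Cipher.UNWRAP_MODE, privateKey);
        return (SecretKey) cipher.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator pairGenerator = KeyPairGenerator.getInstance("RSA");
        pairGenerator.initialize(2048);
        KeyPair pair = pairGenerator.generateKeyPair();

        SecretKey fileKey = newFileKey();
        byte[] wrapped = wrap(fileKey, pair.getPublic());
        SecretKey recovered = unwrap(wrapped, pair.getPrivate());
        System.out.println(Arrays.equals(fileKey.getEncoded(), recovered.getEncoded())); // prints "true"
    }
}
```

With this arrangement only the holder of the private key can read files, while any client that has the public key can write new ones.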

Architecture

The EncryptedFileSystem wraps any FileSystem implementation and encrypts or
decrypts the streams returned by create and open. Each file is encrypted and
decrypted with a unique per-file symmetric key, which is passed to the
KeyStorageStrategy to be stored for future access. The provided storage
strategy implementation encrypts the symmetric key using a public/private key
pair and then stores the encrypted key on the FileSystem alongside the
encrypted file.
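
The flow described above (wrap a backing store, generate a key per file, hand the key to a pluggable storage strategy) can be sketched independently of Hadoop. Everything in the block below is hypothetical and exists only for illustration: EncryptedStoreSketch, KeyStorage, and InMemoryKeyStorage are not types from hadoop-crypto, an in-memory map stands in for the backing FileSystem, and the key material is stored unencrypted, unlike the provided strategy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of the wrapping pattern; these types are not part of hadoop-crypto. */
final class EncryptedStoreSketch {
    /** Stands in for a key storage strategy: persists per-file key material by path. */
    interface KeyStorage {
        void put(String path, byte[] keyAndIv);

        byte[] get(String path);
    }

    /** Simplest possible strategy: keeps key material in memory, unencrypted. */
    static final class InMemoryKeyStorage implements KeyStorage {
        private final Map<String, byte[]> keys = new HashMap<>();

        @Override
        public void put(String path, byte[] keyAndIv) {
            keys.put(path, keyAndIv.clone());
        }

        @Override
        public byte[] get(String path) {
            return keys.get(path).clone();
        }
    }

    // Stands in for the backing FileSystem: path -> raw (encrypted) bytes.
    private final Map<String, ByteArrayOutputStream> files = new HashMap<>();
    private final KeyStorage keyStorage;

    EncryptedStoreSketch(KeyStorage keyStorage) {
        this.keyStorage = keyStorage;
    }

    /** Generates a fresh key for this path, hands it to the storage strategy, and encrypts writes. */
    OutputStream create(String path) throws GeneralSecurityException {
        byte[] keyAndIv = new byte[32]; // 16-byte AES key followed by a 16-byte IV
        new SecureRandom().nextBytes(keyAndIv);
        keyStorage.put(path, keyAndIv);
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        files.put(path, raw);
        return new CipherOutputStream(raw, cipher(Cipher.ENCRYPT_MODE, keyAndIv));
    }

    /** Looks up the key for this path and decrypts reads. */
    InputStream open(String path) throws GeneralSecurityException {
        InputStream raw = new ByteArrayInputStream(rawBytes(path));
        return new CipherInputStream(raw, cipher(Cipher.DECRYPT_MODE, keyStorage.get(path)));
    }

    /** What a reader of the backing store would see: ciphertext. */
    byte[] rawBytes(String path) {
        return files.get(path).toByteArray();
    }

    private static Cipher cipher(int mode, byte[] keyAndIv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(keyAndIv, 0, 16, "AES"), new IvParameterSpec(keyAndIv, 16, 16));
        return cipher;
    }
}
```

Keeping key storage behind an interface is what lets the storage side change (for example, to wrap the key with a public key before persisting it) without touching the stream-wrapping code.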

Standalone Example

The hadoop-crypto-all.jar can be added to the classpath of any client and used
to wrap any concrete backing FileSystem. The scheme of the EncryptedFileSystem
is e[FS-scheme] where [FS-scheme] is any FileSystem that can be instantiated
statically using FileSystem#get (e.g. efile). The FileSystem implementation,
public key, and private key must be configured in the core-site.xml as well.

Hadoop CLI

Add hadoop-crypto-all.jar to the classpath of the CLI (e.g. share/hadoop/common).

Generate public/private keys

    openssl genrsa -out rsa.key 2048
    # Public Key
    openssl rsa -in rsa.key -outform PEM -pubout 2>/dev/null | grep -v PUBLIC | tr -d '\r\n'
    # Private Key
    openssl pkcs8 -topk8 -inform pem -in rsa.key -outform pem -nocrypt | grep -v PRIVATE | tr -d '\r\n'

core-site.xml

    <configuration>
      <property>
        <name>fs.efile.impl</name> <!-- others: fs.es3a.impl or fs.ehdfs.impl -->
        <value>com.palantir.crypto2.hadoop.StandaloneEncryptedFileSystem</value>
      </property>
      <property>
        <name>fs.efs.key.public</name>
        <value>MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAqXkSOcB2UpLrlG3scAHDavPnSucxOwRWG12woY5JerYlqyIm7xcNuyLQ/rLPxdlCGgOZOoPzKVXc/3pAeOdPM1LcXLNW8d7Uht3vo7a6SR/mXMiCTMn+9wOx40Bq0ofvx9K4RSpW2lKrlJNUJG+RP5lO7OhB5pveEBMn/8OR2yMLgS58rHQ0nrXXUHqbWiMI8k+eYK7aimexkQDhIXtbqmQ5tAXKyoSMDAyeuDNY8WsYaW15OCwGSIRClNAiwPEGLQCYJQi41IxwQxwN42jQm7fwoVSrN4lAfi5B8EHxFglAZcE8nUTdTnXCbUk9SPz8XXmK4hmK9X4L+2Av4ucNLwIDAQAB</value>
      </property>
      <property>
        <name>fs.efs.key.private</name>
        <value>MIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCpeRI5wHZSkuuUbexwAcNq8+dK5zE7BFYbXbChjkl6tiWrIibvFw27ItD+ss/F2UIaA5k6g/MpVdz/ekB4508zUtxcs1bx3tSG3e+jtrpJH+ZcyIJMyf73A7HjQGrSh+/H0rhFKlbaUquUk1Qkb5E/mU7s6EHmm94QEyf/w5HbIwuBLnysdDSetddQeptaIwjyT55grtqKZ7GRAOEhe1uqZDm0BcrKhIwMDJ64M1jxaxhpbXk4LAZIhEKU0CLA8QYtAJglCLjUjHBDHA3jaNCbt/ChVKs3iUB+LkHwQfEWCUBlwTydRN1OdcJtST1I/PxdeYriGYr1fgv7YC/i5w0vAgMBAAECggEASvSLhROEwbzNiRadLmT5Q4Kg19YtRgcC9pOXnbzK7wVE3835HmI55nzdpuj7UGxo+gyBZwoZMD0Tw8MUZOUZeH+7ixye5ddCdGwQo34cIl+DiaH9T20/4Yy2zuYc2QTanqyqZ5z0URejX9FRs9PMkC6EY+/NxetGaiGu3UZoalz7F/5wS8bCaKPkm3AjLvqXHL5KiSbPDPBQj4m+iFWLoWZL9FB1zyif+yBatU4cBCLHaTTgXroItEKcxTwFfyi2l059ItoP5E10djKHpMuPiPrTMS0FHAom3GZAYEFnjRgInR0sIotEwuSDObqcio1PdXRsi5Ul8MxfpXxLSuL+UQKBgQDcvmehBARNDksQJGzIyegKg10eLYdfXFCR+QDZeqJod/pCQ6gtW0aFYAoL0uXiMwQzSb6m7offmXH0JLLqOnjgcZlejHUDSTTWtNOYlGaO7OVgFcnG6/UnCE54eJcaw68auvPB9XW3gm5cfWSNpUI+6aJDBb6BKx8uNMoRreq9wwKBgQDEilhsCgUOIRkJfM5MYUzMT0gR8qt671q+lgTjBDwYvdoQ7BijG6Lbqbp9Xd4nODiw1t7e1Rexw+cuIeRs8NITU4f4Nfe25rRhZ+0n7g9OoCiRUoEsmd7cqDk6pubpw9hW1TKKLzTqExisGFy+bnUA8FFs2TbU9Xeb9kdm1GXgJQKBgAsN9f6YRubc+mFakaAUjGxKW9VxDkB2TQqiX6qEe7GjoILFBJ0Q3x06zAX/j8eeKm2vGb8eXuuRsaU6WUNlnjwPNFEJ06pQdjbyY05W0DQEJRCExtARbPuBbPyXfWm3twMtrZtfAYApJgG3vdtiFUk1Rgz5MqshT7RurFfqT8ElAoGAE2BEOVp/hxYSPtI0EGmjRZ0nUMWozDTesF1f2/Wl6xaEchikkSf/VUKVZRik9x7ez+hPDo7ZiCf1GaIzv926CDe69uhzJG/4JoY1ZjNdBPZbKYCFxZzh0MUw5yxfJXquUFkyY1cmE1GQpB6+vfNry4zlqiJ7+mC8yv5rqaKU7JUCgYBXPYpuQppR1EFj66LSrZ8ebXmt5TtwR839UkgEhLOBkO0cFP2BXVAMx9p0/MYLNIPk7vVpVtRCKYr6tBVdUWCin0obC5O+JzuhilQ0aH3xl5mbiasOvCNPjniaTViRt6zNlaq6RMS4x1LqYUyqc4LUrBbGMWJsdjYqVAi1Rq1FTw==</value>
      </property>
    </configuration>

Commands

    ./bin/hadoop dfs -put file.txt efile:/tmp/file.txt
    ./bin/hadoop dfs -ls efile:/tmp
    ./bin/hadoop dfs -cat efile:/tmp/file.txt

Programmatic Example

Source for examples can be found here

Initialization

    // Get a local FileSystem
    FileSystem fs = FileSystem.get(new URI("file:///"), new Configuration());

    // Initialize EFS with random public/private key pair
    KeyPair pair = TestKeyPairs.generateKeyPair();
    KeyStorageStrategy keyStore = new FileKeyStorageStrategy(fs, pair);
    EncryptedFileSystem efs = new EncryptedFileSystem(fs, keyStore);

Writing data using EFS

    // Init data and local path to write to
    byte[] data = "test".getBytes(StandardCharsets.UTF_8);
    byte[] readData = new byte[data.length];
    Path path = new Path(folder.newFile().getAbsolutePath());

    // Write data out to the encrypted stream
    OutputStream eos = efs.create(path);
    eos.write(data);
    eos.close();

    // Reading through the decrypted stream produces the original bytes
    InputStream ein = efs.open(path);
    IOUtils.readFully(ein, readData);
    assertThat(data, is(readData));

    // Reading through the raw stream produces the encrypted bytes
    InputStream in = fs.open(path);
    IOUtils.readFully(in, readData);
    assertThat(data, is(not(readData)));

    // Wrapped symmetric key is stored next to the encrypted file
    assertTrue(fs.exists(new Path(path + FileKeyStorageStrategy.EXTENSION)));

Hadoop Configuration Properties

| Key                  | Value                                                                    | Default           |
|----------------------|--------------------------------------------------------------------------|-------------------|
| fs.efs.cipher        | The cipher used to wrap the underlying streams.                          | AES/CTR/NoPadding |
| fs.e[FS-scheme].impl | Must be set to com.palantir.crypto2.hadoop.StandaloneEncryptedFileSystem |                   |
| fs.efs.key.public    | Base64 encoded X509 public key                                           |                   |
| fs.efs.key.private   | Base64 encoded PKCS8 private key                                         |                   |
| fs.efs.key.algorithm | Public/private key pair algorithm                                        | RSA               |
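
To make the key formats in the table concrete, the sketch below shows how a Base64 encoded X509 public key and a Base64 encoded PKCS8 private key can be turned into Java key objects using only the JDK. It is not the library's loading code. The class KeyPropertySketch and its method names are hypothetical, and the key pair is generated in the example rather than read from a Hadoop Configuration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

/** Illustrative only (hypothetical class): decoding key values in the formats listed above. */
final class KeyPropertySketch {
    /** Parses a value shaped like fs.efs.key.public: Base64 of an X.509 SubjectPublicKeyInfo. */
    static PublicKey parsePublicKey(String base64X509, String algorithm) throws GeneralSecurityException {
        byte[] der = Base64.getDecoder().decode(base64X509);
        return KeyFactory.getInstance(algorithm).generatePublic(new X509EncodedKeySpec(der));
    }

    /** Parses a value shaped like fs.efs.key.private: Base64 of a PKCS#8 private key. */
    static PrivateKey parsePrivateKey(String base64Pkcs8, String algorithm) throws GeneralSecurityException {
        byte[] der = Base64.getDecoder().decode(base64Pkcs8);
        return KeyFactory.getInstance(algorithm).generatePrivate(new PKCS8EncodedKeySpec(der));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        // JDK RSA keys encode as X.509 (public) and PKCS#8 (private) DER.
        String publicValue = Base64.getEncoder().encodeToString(pair.getPublic().getEncoded());
        String privateValue = Base64.getEncoder().encodeToString(pair.getPrivate().getEncoded());

        PublicKey publicKey = parsePublicKey(publicValue, "RSA");
        PrivateKey privateKey = parsePrivateKey(privateValue, "RSA");
        System.out.println(publicKey.getFormat() + " / " + privateKey.getFormat()); // prints "X.509 / PKCS#8"
    }
}
```

The single-line Base64 strings produced by the openssl pipeline earlier in this document have the same shape as the inputs these parse methods expect.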

License

This repository is made available under the Apache 2.0 License.

FAQ

log.warn lines from CryptoStreamFactory

WARN: Unable to initialize cipher with OpenSSL, falling back to JCE implementation

"Falling back to the JCE implementation" results in slower cipher performance than native OpenSSL. Resolve this by installing a compatible OpenSSL (versions 1.0 and 1.1 are currently supported) and symlinking it to the expected location, /usr/lib/libcrypto.so.

Note: to support OpenSSL 1.1, we use releases from the Palantir fork of commons-crypto, because support has been added to the mainline Apache repository but no release has been made since 2016.

    Exception in thread "main" java.io.IOException: java.security.GeneralSecurityException: CryptoCipher {org.apache.commons.crypto.cipher.OpenSslCipher} is not available or transformation AES/CTR/NoPadding is not supported.
        at org.apache.commons.crypto.utils.Utils.getCipherInstance(Utils.java:130)
        at ApacheCommonsCryptoLoad.main(ApacheCommonsCryptoLoad.java:10)
    Caused by: java.security.GeneralSecurityException: CryptoCipher {org.apache.commons.crypto.cipher.OpenSslCipher} is not available or transformation AES/CTR/NoPadding is not supported.
        at org.apache.commons.crypto.cipher.CryptoCipherFactory.getCryptoCipher(CryptoCipherFactory.java:176)
        at org.apache.commons.crypto.utils.Utils.getCipherInstance(Utils.java:128)
        ... 1 more
    Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.commons.crypto.utils.ReflectionUtils.newInstance(ReflectionUtils.java:90)
        at org.apache.commons.crypto.cipher.CryptoCipherFactory.getCryptoCipher(CryptoCipherFactory.java:160)
        ... 2 more
    Caused by: java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
        at org.apache.commons.crypto.utils.ReflectionUtils.newInstance(ReflectionUtils.java:88)
        ... 3 more
    Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: EVP_CIPHER_CTX_cleanup
        at org.apache.commons.crypto.cipher.OpenSslCipher.<init>(OpenSslCipher.java:59)
        ... 8 more
    Caused by: java.lang.UnsatisfiedLinkError: EVP_CIPHER_CTX_cleanup
        at org.apache.commons.crypto.cipher.OpenSslNative.initIDs(Native Method)
        at org.apache.commons.crypto.cipher.OpenSsl.<clinit>(OpenSsl.java:95)
        at org.apache.commons.crypto.cipher.OpenSslCipher.<init>(OpenSslCipher.java:57)
        ... 8 more