Several Python libraries provide access to Presto, among them Dropbox/PyHive, prestosql/presto-python-client, prestodb/presto-python-client, and easydatawarehousing/prestoclient. Most of them implement the Python DB-API interface, so the commands for connecting and querying are largely the same from one library to the next.

The article Python Access to Presto Cluster with Presto Client describes how to use the PrestoSQL client library.

The Dropbox/PyHive library is the more universal option because it can access both Hive and Presto. The sample below was run with Python 3 on Windows.

1. Install PyHive library

Linux.

sudo pip3 install 'pyhive[presto]'

Windows. Run as administrator.

pip install "pyhive[presto]"

The [presto] extra installs only the dependencies needed for the Presto interface.
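
A quick way to confirm the installation is to check that the modules can be found. This is a minimal sketch using only the standard library; the helper name is_installed is made up for illustration, and the assumption that the Presto interface depends on the requests package comes from the import statements used later in this article.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "pyhive" and "requests" are the packages this article imports from.
    for name in ("pyhive", "requests"):
        print(name, "found" if is_installed(name) else "missing")
```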

2. Import required libraries

  • Access to Presto cluster without password.

    from pyhive import presto
    
  • Presto cluster is secured by password.

    from pyhive import presto
    from requests.auth import HTTPBasicAuth
    

3. Establish connection

  • Access to Presto cluster without password.

    conn = presto.connect(host='localhost',
                          port=8080,
                          catalog='system',
                          schema='runtime')
    
  • Presto cluster is secured by password, but SSL verification is skipped. Use this only during development.

    conn = presto.connect(host='localhost',
                          port=443,
                          protocol='https',
                          catalog='system',
                          schema='runtime',
                          requests_kwargs={'auth': HTTPBasicAuth('<user name>', '<password>'),
                                           'verify': False})
    
  • Presto cluster is secured by password and the SSL certificate is verified. The presto.crt file can be obtained in one of two ways.

    Option #1. Follow the instructions in the Convert Java Keystore to PEM File Format article to create the presto.crt file. The file contains the Presto SSL public certificate converted from the Java keystore file.

    Option #2. Extract the presto.crt certificate with a web browser. Follow the Export TLS/SSL Server Certificate from Internet Browser article.

    conn = presto.connect(host='localhost',
                          port=443,
                          protocol='https',
                          catalog='system',
                          schema='runtime',
                          requests_kwargs={'auth': HTTPBasicAuth('<user name>', '<password>'),
                                           'verify': 'presto.crt'})
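
To keep the user name and password out of the source code, the requests_kwargs dictionary can be built from environment variables. The sketch below is one possible approach, not part of PyHive: the helper name build_requests_kwargs and the variable names PRESTO_USER and PRESTO_PASSWORD are assumptions chosen for illustration. It relies on the fact that the requests library accepts a (user, password) tuple for the auth argument as shorthand for HTTPBasicAuth.

```python
import os


def build_requests_kwargs(verify, env=os.environ):
    """Build a requests_kwargs dict for presto.connect().

    verify: path to a certificate file such as 'presto.crt',
            or False to skip SSL verification (development only).
    env:    mapping to read credentials from; PRESTO_USER and
            PRESTO_PASSWORD are hypothetical variable names.
    """
    user = env.get("PRESTO_USER")
    password = env.get("PRESTO_PASSWORD")
    if not user or password is None:
        raise ValueError("PRESTO_USER and PRESTO_PASSWORD must be set")
    # requests treats a (user, password) tuple as HTTP Basic authentication.
    return {"auth": (user, password), "verify": verify}
```

The result would then be passed as requests_kwargs=build_requests_kwargs('presto.crt') in the presto.connect call.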
    

4. Create cursor

cur = conn.cursor()

5. Retrieve data

cur.execute('SELECT * FROM nodes')
for row in cur.fetchall():
    print(row)
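
Because the cursor follows the DB-API, column names are available from cur.description after execute, and rows can be turned into dictionaries. The following is a minimal sketch under two assumptions: the helper name fetch_dicts is made up for illustration, and the cursor exposes close() as the DB-API specifies. The demo uses the standard sqlite3 module only as a stand-in connection so that it runs without a Presto cluster; with PyHive, the conn object from step 3 would be passed instead.

```python
import sqlite3
from contextlib import closing


def fetch_dicts(conn, sql):
    """Run a query and return every row as a dict keyed by column name."""
    with closing(conn.cursor()) as cur:
        cur.execute(sql)
        # The first item of each description entry is the column name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]


if __name__ == "__main__":
    # sqlite3 stands in for a Presto connection here; the table is demo data.
    with closing(sqlite3.connect(":memory:")) as demo:
        demo.execute("CREATE TABLE nodes (node_id TEXT, state TEXT)")
        demo.execute("INSERT INTO nodes VALUES ('n1', 'active')")
        for row in fetch_dicts(demo, "SELECT * FROM nodes"):
            print(row)
```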

6. Improvements

To suppress the insecure-request warnings emitted for HTTPS requests when verify=False is used, add the following code to the import section.

import urllib3
urllib3.disable_warnings()

7. Troubleshooting

If you get the error ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777), check the expiration date of your certificate. The certificate must not be expired.
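
The expiration check can be scripted. Running openssl x509 -enddate -noout -in presto.crt prints a line such as notAfter=Jan  5 09:34:43 2018 GMT, and the standard ssl module can convert that date to a timestamp. The sketch below assumes that output format; the helper name days_until_expiry is made up for illustration.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return the number of days until the certificate expires.

    not_after: date string such as 'Jan  5 09:34:43 2018 GMT', optionally
               prefixed with 'notAfter=' as printed by openssl.
    now:       current time as seconds since the epoch (defaults to time.time()).
    A negative result means the certificate has already expired.
    """
    text = not_after.strip()
    if text.startswith("notAfter="):
        text = text[len("notAfter="):]
    expires = ssl.cert_time_to_seconds(text)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


if __name__ == "__main__":
    remaining = days_until_expiry("notAfter=Jan  5 09:34:43 2018 GMT")
    print("expired" if remaining < 0 else "valid for %.1f more days" % remaining)
```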
