Let's suppose you want to create a task that loads a pipe-delimited file into a database every day. The task runs on macOS and the database is Snowflake. You can do it with Simple Data Toolkit (SDTK) as follows:
1) Download SDTK. For this tutorial, we're going to use the latest release of the Python version as of this writing (0.1.3), which is here: https://sourceforge.net/projects/simple-data-toolkit/files/0.1.3/stc.py/download
2) Create a new text file called ConvertPSV.sh
3) Open the file with a plain-text editor (like TextEdit)
Enter the following text:
#!/bin/sh
python stc.py clients.psv clients.sql createorreplace clients
# For more info on STC and SDTK, see https://www.vis-software.com/#sdtk
export SNOWSQL_PWD=password
snowsql -a myorganization-myaccount -u jsmith -f clients.sql -d database -s public -o quiet=true -o friendly=false
# For more info on snowsql, see https://docs.snowflake.com/en/user-guide/snowsql-use.html
4) Go to the Terminal.
5) Edit the list of cron jobs with the following command:
EDITOR=nano crontab -e
6) Add the following to a new line:
0 8 * * * cd ~ && sh ConvertPSV.sh
7) Press CTRL+O and then Enter to save, then CTRL+X to exit nano.
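To make the conversion step in the script more concrete, here is a minimal Python sketch of what turning a pipe-delimited file into SQL can look like. It is not STC's implementation and its output format is not taken from STC: the function name psv_to_sql, the all-VARCHAR column types, and the sample data are assumptions made for illustration.

```python
import csv
import io

def psv_to_sql(psv_text, table):
    """Turn pipe-delimited text (header row first) into SQL statements."""
    rows = list(csv.reader(io.StringIO(psv_text), delimiter="|"))
    header, data = rows[0], rows[1:]
    # Assumption: every column is created as VARCHAR to keep the sketch simple.
    columns = ", ".join(f"{name} VARCHAR" for name in header)
    statements = [f"CREATE OR REPLACE TABLE {table} ({columns});"]
    for row in data:
        # Double any single quotes so each value is a valid SQL string literal.
        values = ", ".join("'" + value.replace("'", "''") + "'" for value in row)
        statements.append(f"INSERT INTO {table} VALUES ({values});")
    return "\n".join(statements)

# Hypothetical sample data standing in for clients.psv.
sample = "id|name\n1|Ann O'Neil\n2|Bob\n"
sql = psv_to_sql(sample, "clients")
print(sql)
```

In the tutorial itself this work is done by stc.py, so the sketch is only meant to show the kind of file that snowsql receives through the -f option.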
from sdtk import com_sdtk_api_ChatGPTAPI
import os
import fnmatch
import re

texData = ""
texBegin = "\n\\begin{document}\n"
texEnd = "\n\\end{document}"
header = ""
footer = ""
foundBody = False

# Get a list of all .png files in the current directory
png_files = [file for file in os.listdir('.') if fnmatch.fnmatch(file, '*.png')]

def result(text):
    global texData
    global header
    global footer
    global foundBody
    # Strip any Markdown code fences from the response
    text = text.replace("```latex", "").replace("```", "")
    body_match = re.search(r'\\begin{document}(.*?)\\end{document}', text, re.DOTALL)
    if body_match:
        body_content = body_match.group(1).strip()
        if len(texData) > 0:
            # Later images: start a new page before appending the body
            texData = texData + "\n" + "\\newpage" + "\n"
        else:
            # First image: keep its preamble and trailer for the final file
            header_match = re.search(r'^(.*?)\\begin{document}', text, re.DOTALL)
            header = header_match.group(1).strip()
            footer_match = re.search(r'\\end{document}(.*)$', text, re.DOTALL)
            footer = footer_match.group(1).strip()
        texData = texData + body_content
        foundBody = True
    else:
        foundBody = False
        print(text)

# Loop over each .png file, retrying until a document body is found
for file_name in png_files:
    foundBody = False
    while not foundBody:
        com_sdtk_api_ChatGPTAPI.instance().query(result, "Can you generate a LaTeX document to represent the text in this image and format it correctly and return only the LaTeX code?", file_name)

with open("output.tex", "w+") as file:
    # Write the string to the file
    file.write(header + texBegin + texData + texEnd + footer)

Simple Data Toolkit - Tutorial - Git API - Python

Simple Data Toolkit provides an unofficial API for reading files, commits, branches, and repos from the Git API. (At the time of this writing, the release of this is pending for complete support, but it is coming soon.) To retrieve all repos a user has using Simple Data Toolkit, we can do the following:
from sdtk import com_sdtk_api_GitAPI

def printer(data, reader):
    print(reader.toArrayOfNativeMaps(None))

com_sdtk_api_GitAPI.reposAPI().retrieveData({"owner": "Vis-LLC"}, printer)

To retrieve all branches a repo has using Simple Data Toolkit, we can do the following:
from sdtk import com_sdtk_api_GitAPI

def printer(data, reader):
    print(reader.toArrayOfNativeMaps(None))

com_sdtk_api_GitAPI.branchesAPI().retrieveData({"owner": "Vis-LLC", "repo": "Simple-Data-Toolkit"}, printer)

To retrieve all the files in a branch using Simple Data Toolkit, we can do the following:
from sdtk import com_sdtk_api_GitAPI

def printer(data, reader):
    print(reader.toArrayOfNativeMaps(None))

com_sdtk_api_GitAPI.filesAPI().retrieveData({"owner": "Vis-LLC", "repo": "Simple-Data-Toolkit", "branch": "main"}, printer)

To retrieve the data in a file using Simple Data Toolkit, we can do the following:
from sdtk import com_sdtk_api_GitAPI

def printerData(data, reader):
    print(data)

com_sdtk_api_GitAPI.retrieveAPI().retrieveData({"owner": "Vis-LLC", "repo": "Simple-Data-Toolkit-UI", "branch": "main", "path": "index.html"}, printerData)

We can also log in using a personal access token (https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens):
from sdtk import com_sdtk_api_GitAPI

def printerData(data, reader):
    print(data)

com_sdtk_api_GitAPI.instance().setKey("Personal Access Token Here").retrieveAPI().retrieveData({"owner": "Vis-LLC", "repo": "Simple-Data-Toolkit-UI", "branch": "main", "path": "index.html"}, printerData)

Simple Data Toolkit - Tutorial - Ortingo API - Python

At the time of this writing, Ortingo does not have an official API. Fortunately, Simple Data Toolkit provides an unofficial API for reading posts. (At the time of this writing, the release of this is pending for complete support, but it is coming soon.) To retrieve all posts for a given user in Python using Simple Data Toolkit, we can do the following:
from sdtk import com_sdtk_api_OrtingoAPI

def printer(data, reader):
    print(reader.toArrayOfNativeMaps(None))

com_sdtk_api_OrtingoAPI.postsAPI().retrieveData({"owner": "60CQ59FN46SVQFXJ"}, printer)

If we want only a list of titles for a given user, we can do this instead:
from sdtk import com_sdtk_api_OrtingoAPI

def printer(data, reader):
    print(reader.filterColumnsOnly(["title"]).toArrayOfNativeMaps(None))

com_sdtk_api_OrtingoAPI.postsAPI().retrieveData({"owner": "60CQ59FN46SVQFXJ"}, printer)

We can also pull suggested content from Ortingo with the following, where the topics we are searching on are provided with the query parameter (in this case, its value is "data"):
from sdtk import com_sdtk_api_OrtingoAPI

def printerUrls(data, reader):
    print(reader.filterColumnsOnly(["url"]).toArrayOfNativeMaps(None))

com_sdtk_api_OrtingoAPI.suggestionsAPI().retrieveData({"query": "data"}, printerUrls)

And finally, we can also pull comments attached to a post in Ortingo with the following, where the user is myself and the post is a test post I created:
from sdtk import com_sdtk_api_OrtingoAPI

def printerComments(data, reader):
    print(reader.filterColumnsOnly(["commentDate", "post"]).toArrayOfNativeMaps(None))

com_sdtk_api_OrtingoAPI.commentsAPI().retrieveData({"owner": "60CQ59FN46SVQFXJ", "id": "test"}, printerComments)

The columns supported for posts at the time of this writing are:
- id
- owner
- title
- subtitle
- post
- url

For comments, the following columns are supported:
- id
- owner
- commentDate
- replyTo
- post

Simple Data Toolkit - Python - Loading Data and Text Through ChatGPT

Simple Data Toolkit provides an API for passing data and text to ChatGPT and extracting related data or text. (At the time of this writing, the release of this is pending for complete support, but it is coming soon.) Below is an example which uses embedded data on users and orders, plus text loaded from a file.
from sdtk import com_sdtk_api_ChatGPTAPI, com_sdtk_table_ArrayOfMapsReader

#Set
def callbackData(reader):
    print(reader.toArrayOfNativeMaps(None))

def callbackText(data):
    print(data)

users = [
    {"user_id": 1, "first_name": "Sally", "last_name": "Franklin"},
    {"user_id": 2, "first_name": "Lucas", "last_name": "Franklin"},
    {"user_id": 3, "first_name": "Joe", "last_name": "Romeo"},
    {"user_id": 4, "first_name": "Julie", "last_name": "Romeo"},
    {"user_id": 5, "first_name": "Lucia", "last_name": "Templeton"},
]

orders = [
    {"order_id": 1, "user_id": 3, "item_desc": "Book 1", "item_quantity": 2, "date": "2024-01-01"},
    {"order_id": 2, "user_id": 3, "item_desc": "Book 2", "item_quantity": 1, "date": "2024-02-01"},
    {"order_id": 3, "user_id": 3, "item_desc": "Book 3", "item_quantity": 3, "date": "2024-03-01"},
    {"order_id": 4, "user_id": 3, "item_desc": "Book 4", "item_quantity": 0, "date": "2024-04-01"},
    {"order_id": 5, "user_id": 3, "item_desc": "Book 5", "item_quantity": 4, "date": "2024-05-01"}
]

# Ask for the answer as a table
com_sdtk_api_ChatGPTAPI.queryAsReaderWithDataAPI().execute("What can we determine about the buying habits of all of our users?", None, {"Users": com_sdtk_table_ArrayOfMapsReader.readWholeArray(users), "Orders": com_sdtk_table_ArrayOfMapsReader.readWholeArray(orders)}, callbackData)

# Ask for the answer as text
com_sdtk_api_ChatGPTAPI.queryWithDataAPI().execute("What can we determine about the buying habits of all of our users? As a narrative.", None, {"Users": com_sdtk_table_ArrayOfMapsReader.readWholeArray(users), "Orders": com_sdtk_table_ArrayOfMapsReader.readWholeArray(orders)}, callbackText)

# Include the contents of an HTML file in the prompt
content = ""
with open('sample.html', 'r') as content_file:
    content = content_file.read()

com_sdtk_api_ChatGPTAPI.queryWithDataAPI().execute("We also have this document in HTML format on recent trends.\n" + content + "\n\nWhat can we determine about the buying habits of all of our users? As a narrative.", None, {"Users": com_sdtk_table_ArrayOfMapsReader.readWholeArray(users), "Orders": com_sdtk_table_ArrayOfMapsReader.readWholeArray(orders)}, callbackText)

Below is an example which queries data from ChatGPT in a table format using a DataTableReader from SDTK.
from sdtk import com_sdtk_api_ChatGPTAPI

def callback(data, reader):
    print(reader.toArrayOfNativeMaps(None))

com_sdtk_api_ChatGPTAPI.queryAsReaderAPI().retrieveData({
    "query": "List all cities in the USA with known population."
}, callback)
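The callbacks in these examples print whatever reader.toArrayOfNativeMaps(None) returns. Assuming that value behaves like a plain Python list of dictionaries (an assumption; the tutorials do not show its output), post-processing inside a callback could be sketched as below. The helper keep_columns and the placeholder rows are hypothetical and are not part of SDTK; the values are made up and are not real population figures.

```python
def keep_columns(rows, columns):
    """Return copies of the rows that contain only the requested columns."""
    return [{name: row[name] for name in columns if name in row} for row in rows]

# Placeholder rows standing in for the reader output; all values are made up.
rows = [
    {"city": "City B", "state": "BB", "population": 100},
    {"city": "City A", "state": "AA", "population": 300},
    {"city": "City C", "state": "CC"},  # population missing
]

# Keep only rows with a known population, largest first.
known = [row for row in rows if row.get("population") is not None]
known.sort(key=lambda row: row["population"], reverse=True)

trimmed = keep_columns(known, ["city", "population"])
print(trimmed)
```

Where the reader already offers filterColumnsOnly, as in the Ortingo examples, that is likely the simpler route; the sketch only shows how the same idea looks on ordinary Python data.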