Text Generator

Write a Python docstring

An example of how to generate a docstring for a given Python function: we specify the Python version, paste in the code, ask for a docstring in a comment, and end the prompt with the characteristic opening of a docstring (""").

Example input

  • # Python 3.7
    def randomly_split_dataset(folder, filename, split_ratio=[0.8, 0.2]):
        df = pd.read_json(folder + filename, lines=True)
        train_name, test_name = "train.jsonl", "test.jsonl"
        df_train, df_test = train_test_split(df, test_size=split_ratio[1], random_state=42)
        df_train.to_json(folder + train_name, orient='records', lines=True)
        df_test.to_json(folder + test_name, orient='records', lines=True)
    randomly_split_dataset('finetune_data/', 'dataset.jsonl')

    # An elaborate, high quality docstring for the above function:
    """
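As an aside, the pasted snippet presumes `pandas` and scikit-learn's `train_test_split`, whose imports the prompt omits. For readers who want to see the underlying 80/20 split idea without those dependencies, here is a minimal, dependency-free sketch (illustrative only; the function name and ratio-tuple parameter are our own, not part of the prompt):

```python
import random

def randomly_split(records, split_ratio=(0.8, 0.2), seed=42):
    """Shuffle records and split them into train/test lists by ratio."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * split_ratio[0])  # index where train ends
    return shuffled[:cut], shuffled[cut:]

train, test = randomly_split(range(10))
```

With ten records and a (0.8, 0.2) ratio, this yields eight training items and two test items, mirroring what the pandas version does with `test_size=split_ratio[1]`.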

Example real generated results

  • The function takes an object that contains a dataset and splits it into two datasets. An example of such objects is: .. csv-table:: id fname desc 0:a file1 This is first one with no records. 1:b file2 Second record after this. Contains data in both files. Name Description 1a:c :-valid: 42 id: 2 It makes sure to pick either ``file3`` or some other empty string as well (if given) Also note how we use `filter` here - it's just another way from what you probably have seen before. Now let's say I want only half of my original training set! You can do so by making your script generate new

  • Splits a dataset into two halves based on some input parameters; then generates another data file that can be compared to what was first generated. The returned output is an object with three keys corresponding respectively to: - Train/Test datasets (one of which should contain both training and testing sets). The names are either "trea_" or "test" depending upon whether they were created by splitting from existing ones. Both these files will have identical filenames excepting possibly different ending characters. -The splits themselves consist of: - one original dataframe named `origfile` having all columns filled in. These must match exactly up front as this function creates new instances. For instance:: >>> dfsplit_1_0() { 'data': [{"name": "Demo

  • The method below is intended to create a dataset of 500K records that are randomized across files. To achieve this you will need either (1.) use numpy with ndarray() or an index into each file separately (2. do some sort of sorting in python if not available). If using Numpy and having trouble, you can try importing NumPy's shuffle module instead. Usage example: >>> import pandas as pa ... >>> rdata_sorted_df[['Test']] ..... Test Mean std dev min max 0 A 6 1 5 2 10 .....

  • The following code is from our recent article entitled "Python’s Random Splitting Algorithm (RSAT-1), a Versatile and Efficient Train/test Splitter". This example demonstrates how to use RSAT-2 as your own version of RSAT. Sample Code import pandas as pa

Try it yourself: Text Generator Playground - Write a Python docstring