How to read and write to a CSV File using Pandas

import pandas
help(pandas.read_csv)
Help on function read_csv in module pandas.io.parsers:

read_csv(filepath_or_buffer:Union[str, pathlib.Path, IO[~AnyStr]], sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal:str='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, dialect=None, error_bad_lines=True, warn_bad_lines=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
Read a comma-separated values (csv) file into DataFrame.

Also supports optionally iterating or breaking of the file
into chunks.

Additional help can be found in the online docs for
`IO Tools <https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html>`_.

Parameters
----------
filepath_or_buffer : str, path object or file-like object
    Any valid string path is acceptable. The string could be a URL. Valid
    URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is
    expected. A local file could be: file://localhost/path/to/table.csv.

    If you want to pass in a path object, pandas accepts any ``os.PathLike``.

    By file-like object, we refer to objects with a ``read()`` method, such as
    a file handler (e.g. via builtin ``open`` function) or ``StringIO``.
sep : str, default ','
    Delimiter to use. If sep is None, the C engine cannot automatically detect
    the separator, but the Python parsing engine can, meaning the latter will
    be used and automatically detect the separator by Python's builtin sniffer
    tool, ``csv.Sniffer``. In addition, separators longer than 1 character and
    different from ``'\s+'`` will be interpreted as regular expressions and
    will also force the use of the Python parsing engine. Note that regex
    delimiters are prone to ignoring quoted data. Regex example: ``'\r\t'``.
delimiter : str, default ``None``
    Alias for sep.
header : int, list of int, default 'infer'
    Row number(s) to use as the column names, and the start of the
    data. Default behavior is to infer the column names: if no names
    are passed the behavior is identical to ``header=0`` and column
    names are inferred from the first line of the file, if column
    names are passed explicitly then the behavior is identical to
    ``header=None``. Explicitly pass ``header=0`` to be able to
    replace existing names. The header can be a list of integers that
    specify row locations for a multi-index on the columns
    e.g. [0,1,3]. Intervening rows that are not specified will be
    skipped (e.g. 2 in this example is skipped). Note that this
    parameter ignores commented lines and empty lines if
    ``skip_blank_lines=True``, so ``header=0`` denotes the first line of
    data rather than the first line of the file.
...
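
The filepath_or_buffer and sep descriptions in the help text can be tried without any file on disk. What follows is a minimal sketch, assuming a small made-up semicolon-separated table held in a StringIO buffer (the column names and values are hypothetical and are not taken from nativity_dataset.csv). Passing sep=None together with engine="python" asks pandas to detect the delimiter with csv.Sniffer, as the docstring describes.

```python
import io
import pandas

# Hypothetical sample data; not the contents of nativity_dataset.csv.
raw = "name;year\nAdoration;1475\nNativity;1500\n"

# Any object with a read() method is accepted in place of a path.
# sep=None plus the Python engine lets csv.Sniffer guess the delimiter.
df = pandas.read_csv(io.StringIO(raw), sep=None, engine="python")

print(df.columns.tolist())
print(df.shape)
```

If the delimiter is already known, passing it explicitly (for example sep=";") avoids the guess and keeps the faster C engine available.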
pd = pandas.read_csv("nativity_dataset.csv") 
display(pd)
print(pd)
                                             Image URL  ...  Labels
0    https://d3d00swyhr67nd.cloudfront.net/w1200h12...  ...     NaN
1    https://d3d00swyhr67nd.cloudfront.net/w1200h12...  ...     NaN
2    https://d3d00swyhr67nd.cloudfront.net/w1200h12...  ...     NaN
3    https://d3d00swyhr67nd.cloudfront.net/w1200h12...  ...     NaN
4    https://d3d00swyhr67nd.cloudfront.net/w944h944...  ...     NaN
..                                                 ...  ...     ...
215  https://upload.wikimedia.org/wikipedia/commons...  ...     NaN
216  https://preview.redd.it/4tc6qepayw521.jpg?widt...  ...     NaN
217  https://upload.wikimedia.org/wikipedia/commons...  ...     NaN
218  https://upload.wikimedia.org/wikipedia/commons...  ...     NaN
219  https://upload.wikimedia.org/wikipedia/commons...  ...     NaN

[220 rows x 3 columns]
pd

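What happened there? With the default header='infer' and no names passed, read_csv took the first line of the file as the column names and loaded the rest of the file as 220 rows of data.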

pd = pandas.read_csv("nativity_dataset.csv", names=["Precise Image URL", "Precise Source URL", "Precise Labels"]) 
pd
pd = pandas.read_csv("nativity_dataset.csv", header=0, names=["Precise Image URL", "Precise Source URL", "Precise Labels"]) 
pd
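
The difference between the two calls with names can be seen on a tiny stand-in file. This is a sketch under stated assumptions: the two data rows and the "Source URL" header are made up for illustration, since the real file's middle column name is not shown in the output. Per the help text, passing names alone behaves like header=None, so the original header line is kept as a data row; adding header=0 consumes that line and replaces it.

```python
import io
import pandas

# Hypothetical two-row file with a header line; stands in for the real CSV.
raw = (
    "Image URL,Source URL,Labels\n"
    "http://a/1.jpg,http://a,\n"
    "http://b/2.jpg,http://b,\n"
)
new_names = ["Precise Image URL", "Precise Source URL", "Precise Labels"]

# names only: header is treated as None, so the old header line becomes row 0.
kept = pandas.read_csv(io.StringIO(raw), names=new_names)

# header=0 plus names: the old header line is consumed and replaced.
replaced = pandas.read_csv(io.StringIO(raw), header=0, names=new_names)

print(len(kept), kept.iloc[0, 0])
print(len(replaced), replaced.iloc[0, 0])
```

On the real dataset the same effect should show up as one extra row in the first result, with the old column titles sitting in row 0.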
del pd["Precise Labels"] 
pd
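
The del statement removes the column from the DataFrame in place. As an alternative sketch, DataFrame.drop with the columns argument returns a new DataFrame and leaves the original untouched, which can be handy when the unmodified data is still needed. The one-row frame below is hypothetical and only mirrors the renamed column titles used above.

```python
import pandas

# Hypothetical one-row frame reusing the renamed column titles.
df = pandas.DataFrame(
    {"Precise Image URL": ["http://a/1.jpg"], "Precise Labels": [None]}
)

# drop() returns a copy without the column; df itself is unchanged.
trimmed = df.drop(columns=["Precise Labels"])

print(trimmed.columns.tolist())
print(df.columns.tolist())
```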
pd.to_csv("nativity_dataset_updated.csv")
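
One thing worth knowing about to_csv is that it writes the row index as an extra first column unless index=False is passed. The sketch below writes a small hypothetical frame to in-memory buffers instead of a file (the data is made up), then reads one of them back to show how the saved index reappears as an unnamed column.

```python
import io
import pandas

# Hypothetical frame; stands in for the DataFrame loaded from the CSV.
df = pandas.DataFrame({"Precise Image URL": ["http://a/1.jpg", "http://b/2.jpg"]})

# Default: the index is written as an unnamed leading column.
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)
print(header_without)

# Reading the default output back turns the saved index into a data column.
round_trip = pandas.read_csv(io.StringIO(with_index.getvalue()))
print(round_trip.columns.tolist())
```

For the file written above, that would mean calling to_csv("nativity_dataset_updated.csv", index=False) if the numeric index is not wanted in the output.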

