How to Open a File in Python

If you have ever wanted to know how to open a file using Python, but were unsure of the syntax or of exactly how Python works with files, then this tutorial is aimed at you.  It walks you through the basics of opening a file with Python using two real-world scenarios.  In the first scenario, the file is handed to the program on the command line, either as a filename argument or piped in through STDIN.  In the second scenario, the file is opened from a file path referenced anywhere in your program.  Both scenarios use the same core logic to handle file I/O; the basic difference is how the file is passed into the program.  By the end of this tutorial you should know a lot more about how to use Python to open files and how to read the content of those files line by line.

🐍  Python is an amazing programming language that can be used for testing, automation, data science, operating system scripting, and fast utility work.  Opening files is essential for each of these engineering tasks.  Python uses the built-in open function to process file input, much like fopen does in C or C++: the first argument is a string that references the file, and the second argument is the mode in which the file is to be opened (for example, 'r' for reading).  Let's take a look at the two examples below to gain a better understanding of how this works.
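
Before the full examples, here is a minimal sketch of the two arguments to open in isolation.  The file name demo.txt and the temporary directory are hypothetical, chosen only so the snippet runs without relying on any file that already exists.

```python
import os
import tempfile

# Hypothetical demo file, created in a temporary directory so the
# sketch does not depend on any file already existing.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' opens the file for writing, creating it if needed.
with open(demo_path, "w") as file_obj:
    file_obj.write("first line\nsecond line\n")

# 'r' opens the file for reading; it is also the default mode.
with open(demo_path, "r") as file_obj:
    content = file_obj.read()

print(content)
```

The with statement closes the file automatically when the block ends, which is why the examples in this tutorial never call close themselves.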

NOTE: This tutorial assumes that you are using Python 2.7 or greater.  The code samples cover Python 2.7.10 and Python 3.6.1 and are tested on Ubuntu 16.04 Linux and macOS High Sierra 10.13.2.  I have not tested the code in a Windows environment; if you are using a Linux subsystem on Windows with Python available, the results should be fairly similar, but this cannot be guaranteed.
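
One detail that differs between the two Python versions is text encoding.  As a hedged aside, the sketch below uses io.open, which accepts an encoding argument on both Python 2.7 and Python 3 (on Python 3 the built-in open is the same function).  The file name and sample text are assumptions made for the demo; the rest of this tutorial sticks to the plain built-in open.

```python
import io
import os
import tempfile

# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_utf8.txt")

# Write and read back text with an explicit encoding so the result
# does not depend on the interpreter version or platform default.
with io.open(demo_path, "w", encoding="utf-8") as file_obj:
    file_obj.write(u"caf\u00e9\n")

with io.open(demo_path, "r", encoding="utf-8") as file_obj:
    text = file_obj.read()
```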

 

🐍  Open a File with STDIN

In the example below I created a small class called File to process an argument from the command line and read each line into a list property called file_lines.  Creating a class for this specific functionality may be overkill, but it gives you, the software architect, the flexibility to plug this utility class into your program wherever it makes sense.

Let's take a look at how reading the file works.  First, in the main function, an instance of the File object is created by calling the File constructor.  In the constructor, three private properties are created, along with one public property to hold each line of the file.  The first call in the constructor is to the parse_stdin method.  The parse_stdin method performs one of three tasks: recording the command line argument that names the file to be opened, recording sys.stdin as the file object when input is piped in, or notifying the user if no file input can be found.

#
# Parse STDIN if any was presented when the program was run
#
def parse_stdin(self):
    # Use sys.argv or sys.stdin to get the file input from the command line
    if len(sys.argv) > 1:
        # sys.argv[1] holds the filename passed on the command line
        self.__file_input_name = sys.argv[1]
    elif not sys.stdin.isatty():
        # isatty() is False when stdin is piped or redirected instead of
        # attached to a terminal http://man7.org/linux/man-pages/man3/isatty.3.html
        # Recognize that there is a file object in sys.stdin
        self.__file_input_stdin = sys.stdin
    else:
        # Display the error to the console
        print("Oops! There was an issue locating your input file")
        sys.exit("Please provide a valid filename or file as a CLI argument")
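
The same three-way decision can be sketched as a standalone function, which makes it easy to try without running a script from the shell.  The names choose_input and FakeTerminal are hypothetical and exist only for this illustration; they are not part of the File class.

```python
import io


def choose_input(argv, stdin):
    """Return ("path", filename) or ("stream", stdin); raise if neither applies."""
    if len(argv) > 1:
        # A filename was given on the command line.
        return ("path", argv[1])
    if not stdin.isatty():
        # stdin is piped or redirected, so it can be read as a file object.
        return ("stream", stdin)
    raise ValueError("Please provide a filename or pipe a file in on STDIN")


class FakeTerminal(io.StringIO):
    """Hypothetical stand-in for an interactive terminal (demo only)."""

    def isatty(self):
        return True


# io.StringIO reports isatty() == False, so it behaves like piped input.
piped = io.StringIO(u"piped data\n")

kind_a, value_a = choose_input(["main.py", "text_file.txt"], FakeTerminal())
kind_b, value_b = choose_input(["main.py"], piped)
```

Passing argv and stdin in as parameters, rather than reading sys.argv and sys.stdin directly, is a design choice made here only so the logic can be exercised with fake inputs.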

Next, after the input has been gathered from the command line, it is time to open the file and read the contents.  In this method a try block wraps the file access so that any I/O exception is caught and the user is notified of the error.  Inside the try block a conditional statement checks whether STDIN has a file object set.  If not, the file is opened by name; if there is an object set for STDIN, that file object is read directly.

After the file's contents are read into a Python string, the string is broken up into a list at the newline characters.  This makes sure that the file can be processed line by line.
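
Two string methods can do this splitting, and they treat a trailing newline differently.  The short sketch below, using made-up sample content, shows both side by side.

```python
content = "alpha\nbeta\n"

# split("\n") keeps an empty string after the final newline.
by_split = content.split("\n")

# splitlines() drops it and also handles "\r\n" line endings.
by_splitlines = content.splitlines()

print(by_split)
print(by_splitlines)

# An empty file is where the difference matters most.
empty_split = "".split("\n")
empty_splitlines = "".splitlines()
```

With split("\n") an empty file still produces a list containing one empty string, so a length check on the list will not detect it; splitlines() returns an empty list in that case.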

#
# Read the contents of the file that was presented
#
def read_content_of_file(self):
    # -------- Attempt to open the file and read contents --------

    # Open file, read contents into a list to be parsed
    try:
        # Open the named file, or fall back to the STDIN file object
        if self.__file_input_stdin is None:
            with open(self.__file_input_name, 'r') as file_obj:
                self.__complete_content = file_obj.read()
        else:
            self.__complete_content = self.__file_input_stdin.read()

        # Split the string contents into a list of lines
        self.file_lines = self.__complete_content.splitlines()

    except (IOError, OSError) as err:
        # open() and read() report I/O failures as IOError/OSError
        error_raised = "Error loading file: " + str(err)
        # Display the error to the console
        print(error_raised)
        sys.exit("This program needs an input file to continue. Exiting...")
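
It is worth being precise about which exception a failed open raises.  A missing file produces IOError on Python 2.7 and FileNotFoundError (a subclass of OSError) on Python 3; neither is a ValueError.  The sketch below, using a hypothetical path that is guaranteed not to exist, records which handler actually runs.

```python
import os
import tempfile

# A path inside a fresh temporary directory, so the file cannot exist.
missing_path = os.path.join(tempfile.mkdtemp(), "does_not_exist.txt")

caught = None
try:
    open(missing_path, "r")
except ValueError:
    # Not reached for a missing file.
    caught = "ValueError"
except (IOError, OSError):
    # Reached on both Python 2.7 and Python 3.
    caught = "IOError/OSError"

print(caught)
```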

Let's take a look at the entire routine from start to finish:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# $ python main.py text_file.txt
# $ ./main.py text_file.txt
#
from __future__ import print_function
import sys


class File():

    #
    # Constructor
    #
    def __init__(self):
        self.file_lines = []
        self.__file_input_name = ""
        self.__complete_content = ""
        self.__file_input_stdin = None

        # Read STDIN and the contents of the file
        self.parse_stdin()

        # Compare strings with !=; "is not" tests object identity
        if self.__file_input_name != "" or self.__file_input_stdin is not None:
            self.read_content_of_file()
        else:
            sys.exit("Something went wrong parsing your input file.  Exiting...")

        if len(self.file_lines) == 0:
            sys.exit("Nothing was read in to memory from the file.  Exiting...")

    #
    # Parse STDIN if any was presented when the program was run
    #
    def parse_stdin(self):
        # Use sys.argv or sys.stdin to get the file input from the command line
        if len(sys.argv) > 1:
            # sys.argv[1] holds the filename passed on the command line
            self.__file_input_name = sys.argv[1]
        elif not sys.stdin.isatty():
            # isatty() is False when stdin is piped or redirected instead of
            # attached to a terminal http://man7.org/linux/man-pages/man3/isatty.3.html
            # Recognize that there is a file object in sys.stdin
            self.__file_input_stdin = sys.stdin
        else:
            # Display the error to the console
            print("Oops! There was an issue locating your input file")
            sys.exit("Please provide a valid filename or file as a CLI argument")

    #
    # Read the contents of the file that was presented
    #
    def read_content_of_file(self):
        # -------- Attempt to open the file and read contents --------

        # Open file, read contents into a list to be parsed
        try:
            # Open the named file, or fall back to the STDIN file object
            if self.__file_input_stdin is None:
                with open(self.__file_input_name, 'r') as file_obj:
                    self.__complete_content = file_obj.read()
            else:
                self.__complete_content = self.__file_input_stdin.read()

            # Split the string contents into a list of lines
            self.file_lines = self.__complete_content.splitlines()

        except (IOError, OSError) as err:
            # open() and read() report I/O failures as IOError/OSError
            error_raised = "Error loading file: " + str(err)
            # Display the error to the console
            print(error_raised)
            sys.exit("This program needs an input file to continue. Exiting...")


# Main function
def main():
    # If there is a file expected from STDIN, create the file object
    # and the file object will attempt to read the contents of STDIN
    # in the constructor.
    input_file = File()

    for line in input_file.file_lines:
        print("Line for file: " + line)


# Execution of the main function
if __name__ == "__main__":
    main()
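
For comparison, the standard library's fileinput module covers a similar "filename argument or STDIN" pattern in a few lines.  This is offered as an alternative sketch, not as the approach used by the File class above.  The demo file name and its contents are assumptions, and the path is passed explicitly so the snippet behaves the same however it is run.

```python
import fileinput
import os
import tempfile

# Hypothetical input file, created only for this demo.
demo_path = os.path.join(tempfile.mkdtemp(), "text_file.txt")
with open(demo_path, "w") as file_obj:
    file_obj.write("one\ntwo\n")

# With no files argument, fileinput.input() reads the names found in
# sys.argv[1:] and falls back to sys.stdin when there are none.
lines = []
for line in fileinput.input(files=[demo_path]):
    lines.append(line.rstrip("\n"))
fileinput.close()

for line in lines:
    print("Line for file: " + line)
```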

 

🐍  Open a File with a File Reference

Now let's take a look at using just the built-in open function to open any file from within your program.  The difference between this example and the previous one is that a string-based file reference is used to open the file instead of a command line argument passed in at execution time.  This is the more realistic example from an application standpoint, whereas the previous example was more of a testing or automation example.

The code below is a lot more straightforward.  First, the log_file variable is set to hold the file path, relative to the directory the program is run from.  The complete_content variable is set to an empty string as a placeholder for the raw contents of the file.  Then, same as before, a try block catches any I/O exceptions while the built-in open function, with a mode of read, opens the file and binds the resulting file object to file_obj.  Once file_obj is set, its contents are read into the complete_content string, and that string is then split at the newline characters into a list called content_list.  The last step is a for loop that prints the list one line at a time.

from __future__ import print_function
import sys


# Main function
def main():

    log_file = 'logs/log'
    complete_content = ''
    content_list = []
    try:
        with open(log_file, 'r') as file_obj:
            complete_content = file_obj.read()

    except (IOError, OSError) as err:
        # open() and read() report I/O failures as IOError/OSError
        error_raised = "Error loading file: " + str(err)
        # Display the error to the console
        print(error_raised)
        sys.exit("This program needs an input file to continue. Exiting...")

    # Split the string contents into a list of lines
    content_list = complete_content.splitlines()

    for line in content_list:
        print("Line for file: " + line)

    # Print instead of exit("...") so a successful run returns status 0
    print("Exiting...")


# Execution of the main function
if __name__ == "__main__":
    main()
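
For large log files it can be preferable not to hold the whole file in one string.  The sketch below is a variation, under the assumption that returning None on failure suits your program; the helper name read_lines and the demo file are hypothetical.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of path without trailing newlines, or None on failure."""
    try:
        with open(path, "r") as file_obj:
            # Iterating over the file object yields one line at a time.
            return [line.rstrip("\n") for line in file_obj]
    except (IOError, OSError) as err:
        print("Error loading file: " + str(err))
        return None


# Hypothetical log file created only for this demo.
demo_dir = tempfile.mkdtemp()
log_file = os.path.join(demo_dir, "log")
with open(log_file, "w") as file_obj:
    file_obj.write("started\nfinished\n")

found = read_lines(log_file)
missing = read_lines(os.path.join(demo_dir, "no_such_file"))
```

Here found holds the two lines of the demo file, while missing is None because the second path does not exist and the error is reported instead of raised.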

In Summary ⌛️

Reading files in Python can be very useful for automation, data science, testing, and log parsing.  It is a crucial language feature that you need to know if you are working with Python.  After reading the above tutorial I hope you now know a bit more about how Python works with files and how simple it can be.

Where to go next? You can find all of the code from this post up on my GitHub Repo here if you want to take a look or try it out on your own.  Please let me know if you have any questions, comments, or concerns on any of the examples I ran through.  As always, thank you for reading!
