Q & A

2021.2.23 Newline Characters in Text Files

Q: Windows/DOS uses Carriage Return and Line Feed ("\r\n") as a line ending, while Unix uses just Line Feed ("\n"). Why does this example in Jupyter Notebook show that the string returned by infile.read() contains only "\n"? Is there something wrong?

A: Excellent question! Let's run an experiment to investigate this phenomenon.
  1. Use Notepad to create a text file "test.txt" containing the following three lines:
    A
    B
    C
    Note that you should press ENTER after "C" so that the last line contains a line ending.
  2. "DIR test.txt" shows that the file size is 9 bytes: three letters, each followed by a two-byte "\r\n" line ending.
  3. "od -t d1 test.txt" shows the ASCII code of each byte.
    "od -c test.txt" shows the nine characters.
  4. Now try to read the text file into Python:
    infile = open("test.txt", "r")
    s = infile.read()
    print(len(s), repr(s))
    infile.close()
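  5. You will see that the string length is 6, and its content is "A\nB\nC\n".
  6. This implies that Python automatically performs a conversion when it handles text files: in text mode with the default newline=None, each "\r\n" read from the file is translated to "\n".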
  7. Let's do another experiment. This time, we open the file as a binary file:
    infile = open("test.txt", "br")
    b = infile.read()
    print(len(b), repr(b))
    infile.close()
  8. This time b contains the "\r\n" line endings unchanged (b"A\r\nB\r\nC\r\n"), and the total size is 9 bytes, matching the file size on disk.
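
The experiment above can also be reproduced without Notepad. The following is a minimal sketch, not part of the original experiment: it assumes we write the nine bytes ourselves into a temporary file (the temporary directory and the variable names are illustrative), so the result is the same on any operating system. It also shows that passing newline="" to open() keeps text mode but turns the translation off.

```python
import os
import tempfile

# Write the nine bytes explicitly so the sketch behaves the same on any OS.
raw = b"A\r\nB\r\nC\r\n"
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wb") as outfile:
    outfile.write(raw)

# Default text mode (newline=None): "\r\n" is translated to "\n" on reading.
with open(path, "r") as infile:
    translated = infile.read()

# newline="" keeps text mode but disables the translation.
with open(path, "r", newline="") as infile:
    untranslated = infile.read()

# Binary mode returns the bytes exactly as they are stored.
with open(path, "rb") as infile:
    data = infile.read()

print(len(translated), repr(translated))      # 6 'A\nB\nC\n'
print(len(untranslated), repr(untranslated))  # 9 'A\r\nB\r\nC\r\n'
print(len(data), repr(data))                  # 9 b'A\r\nB\r\nC\r\n'
```

So nothing is wrong with the file: the bytes on disk still contain "\r\n", and only the string returned in default text mode has been normalized.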

2021.2.23 self

Q: What is the meaning of the "self" parameter in a method of a class? Is it used to specify that this is a private member function?

A: The quick answer is "No", but we should clarify this concept (and related ones) with a couple of examples.
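  1. The "self" in Python is quite different from "private" in C++. The "private" keyword in C++ marks the members that follow it as private, so they can only be accessed by member functions or friends of that class. In the code below, the commented-out line "age = 2021 - a.year;" would not compile because year is private; Teacher must call getYear() instead.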
    #include <iostream>
    using std::cout;
    using std::endl;
    
    class Student {
    public:
        Student(int y) { year = y; }
        int getYear() { return year; }
    private:
        int year;
    };
    
    class Teacher {
    public:
        int calcAge(Student a) {
            int age;
            // age = 2021 - a.year;
            age = 2021 - a.getYear();
            return age;
        }
    };
    
    int main()
    {
        Student alice(1999);
        Student bob(2000);
        Teacher ted;
    
        cout << ted.calcAge(alice) << endl;
        cout << ted.calcAge(bob) << endl;
        return 0;
    }
  2. If we translate this C++ code to Python, both year and getYear() can be accessed from outside the class, because Python attributes are public by default.
    class Student:
        def __init__(self, y):
            self.year = y
        def getYear(self):
            return self.year
    
    class Teacher:
        def calcAge(self, a):
            age = 2021 - a.year
            # age = 2021 - a.getYear()
            return age
    
    def main():
        alice = Student(1999)
        bob = Student(2000)
        ted = Teacher()
    
        print( ted.calcAge(alice) )
        print( ted.calcAge(bob) )
    
    main()
  3. P.790 Python's self is the same as the this pointer in C++ or Java, but self is always explicit in both headers and bodies of Python methods to make attribute accesses more obvious.
  4. alice.getYear() is automatically translated to Student.getYear(alice), so self receives the object on which the method is called.
    class Student:
        def __init__(self, y):
            self.year = y
        def getYear(self):
            return self.year
    
    class Teacher:
        def __init__(self, n):
            self.age = n
        def calcAgeDiff(self, a):
            studentAge = 2021 - a.year
            return self.age - studentAge
    
    def main():
        alice = Student(1999)
        ted = Teacher(28)
    
        print( alice.getYear(), Student.getYear(alice) )
        print( ted.calcAgeDiff(alice), Teacher.calcAgeDiff(ted, alice) )
    
    main()
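  5. Python has no truly private members, but if you prefix the member name with a double underscore "__", Python mangles the name so that it cannot be reached under that name from outside the class. In the code below, alice.getYear() works, while alice.__year raises an AttributeError.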
    class Student:
        def __init__(self, y):
            self.__year = y
        def getYear(self):
            return self.__year
    
    def main():
        alice = Student(1999)
    
        print( alice.getYear() )
        print( alice.__year )    # raises AttributeError
    
    main()
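
A double-underscore name is renamed rather than hidden. The following is a minimal sketch reusing the Student class from step 5 (the variables direct_access and mangled_value are only illustrative): inside the class body, Python rewrites __year to _Student__year, so the attribute is still reachable under the mangled name. This is a convention to prevent accidental access, not an enforced access control like "private" in C++.

```python
class Student:
    def __init__(self, y):
        self.__year = y          # stored under the mangled name _Student__year

    def getYear(self):
        return self.__year       # inside the class, __year is rewritten the same way


alice = Student(1999)

try:
    alice.__year                 # outside the class no rewriting happens
    direct_access = "ok"
except AttributeError:
    direct_access = "AttributeError"

mangled_value = alice._Student__year   # still reachable, so not truly private

print(alice.getYear())   # 1999
print(direct_access)     # AttributeError
print(mangled_value)     # 1999
```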