I'm stuck on a Python exercise that I'm trying to do!
Write a script to calculate the co-occurrences in the "text-b.txt"
file, which is encoded in UTF-8. This file is the first argument passed to your script.
The second argument is the length of the co-occurrence, which can range from 2 to n tokens.
The third argument is the frequency of the co-occurrence, which can range from 1 to n.
The last two arguments are the lengths of the first and last tokens of the co-occurrence.

What I have tried:

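# I did not understand it well, but this is my first attempt:
import re
import sys

dic = {}
long = int(sys.argv[2])  # co-occurrence length, in tokens
freq = int(sys.argv[3])  # co-occurrence frequency to report

with open(sys.argv[1], "r", encoding="utf-8") as text:
    for line in text:
        # Split on runs of non-word characters and drop empty strings
        a = [t for t in re.split(r"\W+", line.lower()) if t]
        # Zip the token list against shifted copies of itself to get the n-grams
        for j in zip(*(a[i:] for i in range(long))):
            dic[j] = dic.get(j, 0) + 1

with open("res.txt", "w", encoding="utf-8") as output:
    for k in sorted(dic):
        if dic[k] == freq:
            output.write(" ".join(k) + " " + str(dic[k]) + "\n")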
Comments
OriginalGriff 11-Jul-19 14:05pm
   
And?
What does it do that you didn't expect, or not do that you did?
What have you tried to find out why?
Where are you stuck?
What help do you need?
ZurdoDev 12-Jul-19 12:46pm
   
What is your question?
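
One possible reading of the exercise, sketched here for illustration only: a co-occurrence is a run of n consecutive tokens on one line, and the last two arguments filter on the character lengths of its first and last tokens. The function name `cooccurrences`, its parameter names, and the sample sentences are hypothetical. The exact-match comparisons (`==`) are also an assumption, since the exercise does not say whether the frequency and lengths are exact values or minimums.

```python
import re
from collections import Counter


def cooccurrences(lines, n, freq, first_len, last_len):
    """Count runs of n consecutive tokens, then keep those passing all filters."""
    counts = Counter()
    for line in lines:
        # Tokens are runs of word characters, lowercased
        tokens = re.findall(r"\w+", line.lower())
        # Slide a window of n tokens across the line (windows never span two lines)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    # Assumption: frequency and token lengths must match exactly
    return {
        gram: count
        for gram, count in counts.items()
        if count == freq
        and len(gram[0]) == first_len
        and len(gram[-1]) == last_len
    }


sample = ["the cat sat on the mat", "the cat ran to the mat"]
result = cooccurrences(sample, n=2, freq=2, first_len=3, last_len=3)
print(result)
```

With the two sample sentences, only the pairs "the cat" and "the mat" occur twice and have three-letter first and last tokens, so those are the only entries kept. In a script, `lines` would be the opened file and the four numbers would come from `sys.argv[2]` through `sys.argv[5]`, converted with `int`.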

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



