CSIS 440 Assignments


Generally, work will be submitted electronically by:

New assignments are added as the semester progresses. Check back often.

Weekly
Due 1/21

(50 Points) Assignment 1: Short Presentation & Discussion
You will give a short 3 to 5 minute presentation to the class, without slides, on the topic of how human minds relate to machines. You may have a note card or piece of paper with you as you speak. The goal of the presentation is to persuade the audience to your viewpoint on the issue. Pick a specific question within the topic, like "Can machines have souls?" or "Is the human mind/brain a machine?" or "Can AI completely surpass human intelligence and creativity?" or any other question related to the topic, and make an argument for or against it. Be prepared for questions from the audience.

You will not be graded based on the stance of your argument (unless it is inappropriate or off topic) but on how well prepared you are and whether you can fit it into 3 to 5 minutes. You will need to practice your talk; do not simply improvise. Be sure to clearly state the question you are attempting to answer. What I am looking for is a good argument, i.e., one that is logical, based on clear principles, and supported by evidence and research. That is, I'm not going to penalize you for having the "wrong" viewpoint or one that is controversial. It is OK to articulate someone else's argument, e.g., Searle's; just be sure to understand their argument well and give them credit. Upload your speaker notes to Canvas. In your notes I would like to see an outline of your argument and references to the papers, talks, Scripture, etc. that informed your argument.

I have a Google Drive folder with some papers and references to get you started. Also, it is OK to use LLMs (ChatGPT, Claude, etc.) for research: ask them questions about different ideas you have and existing arguments! If you are going to use something an LLM said, just make sure to trace the argument to its origin and cite it in your notes. For example, if you have questions about Searle's Chinese room argument and you talk to ChatGPT about it, make sure to cite both Searle and ChatGPT if you use it in your presentation. Also consider giving the outline of your presentation to an LLM and asking it for critique, feedback, or even what question it thinks someone would ask in response to your presentation.

Due 2/6

(50 Points) Assignment 2: A* Search

Implement A* search in Python to find the shortest path in a state space of cities connected by roads. The starter repo contains a data file in GraphML format. This data file defines a graph of cities (nodes) and roads between cities (edges).

The starter code in the provided program uses NetworkX to represent the graph of cities and roads; you will need to use the NetworkX API to implement your solution. The repo also contains a Conda environment definition file you can use to create your Python environment (e.g., by using Miniconda).

Some implementation notes:

Your program should start by initializing the current node as the start node (the city of Portland, labeled portland in the graph).

The graph already contains the step cost (i.e., actual highway mileage between cities) from each node n to its successors, stored as edge weights; you should use these to determine the path cost g(n) for each successor.

The graph already contains the straight line distance from each node to the goal node (the city of Pacific City, labeled pacificcity in the graph); you should use this as the heuristic cost h(n) for each successor.

If the current node is not the goal, your program should select the next node for expansion by considering the path cost and heuristic cost for each successor and minimizing the evaluation function f(n) = g(n) + h(n).

Once the shortest path to the goal is found, your program should print out the path (i.e., the list of cities along the shortest path), along with the path cost. Print the city names one per line, like the following:

Portland
city1
city2
city3
...
The shortest path cost is: xxxx
Of course, replace "city1", etc. with the actual city names and "xxxx" with the actual cost. Round the cost to 4 decimal places.
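
The implementation notes above can be sketched roughly as follows. This is only a minimal illustration, not the starter code: plain dictionaries stand in for the NetworkX graph, and the node names, step costs, and heuristic values are made up for the example.

```python
import heapq


def a_star(graph, h, start, goal):
    """Return (path, cost) for the cheapest path from start to goal.

    graph: {node: {successor: step_cost}}  (stand-in for the NetworkX graph)
    h:     {node: estimated cost to goal}  (stand-in for straight line distance)
    """
    # Frontier entries are (f, g, node, path); heapq pops the smallest f first.
    frontier = [(h[start], 0.0, start, [start])]
    best_g = {start: 0.0}  # cheapest known path cost g(n) to each node

    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g[node]:
            continue  # stale entry: a cheaper route to this node was already found
        for successor, step_cost in graph[node].items():
            new_g = g + step_cost
            if new_g < best_g.get(successor, float("inf")):
                best_g[successor] = new_g
                f_successor = new_g + h[successor]  # f(n) = g(n) + h(n)
                heapq.heappush(
                    frontier, (f_successor, new_g, successor, path + [successor])
                )
    return None, float("inf")


# Hypothetical toy data, not the cities in the assignment's GraphML file.
toy_roads = {
    "start": {"a": 2.0, "b": 5.0},
    "a": {"start": 2.0, "b": 1.5, "goal": 6.0},
    "b": {"start": 5.0, "a": 1.5, "goal": 2.0},
    "goal": {"a": 6.0, "b": 2.0},
}
toy_h = {"start": 4.0, "a": 3.0, "b": 2.0, "goal": 0.0}

path, cost = a_star(toy_roads, toy_h, "start", "goal")
for city in path:
    print(city)
print(f"The shortest path cost is: {cost:.4f}")
```

In your solution the successors, step costs, and heuristic values would come from the NetworkX graph instead of these dictionaries; how they are stored there is something to check in the starter code.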

Submit your code by pushing your main branch to GitLab.

Due 2/18

(50 Points) Assignment 3: Genetic Search Algorithm
You will implement a genetic search algorithm in Python to find a solution to the n-queens problem, constrained to n = 8.

The starter code in the provided program uses "digit strings" to represent board states; see the provided State class (in the corresponding state.py file) for details. Several aspects of the genetic algorithm are already implemented for you, including mating a pair of individuals, and applying a random mutation to the resulting successor state. There is also a static generator available for you to use to generate k individuals to form the initial population. Your task is to implement the high-level genetic algorithm in the genetic_search.py file (see the TODO comments). This includes implementing a fitness function and a goal test for the algorithm to use, as well as a utility method to print out the board state. Your program does not need to find all solutions; it should simply stop once the first solution is found. Once a solution is found, your program should print out some visual representation of the board state. For example, given a solution with a digit string of "53172864", the corresponding visual representation of the board state should be the following:

8 | | | | | |Q| | |
7 | | | |Q| | | | |
6 | | | | | | |Q| |
5 |Q| | | | | | | |
4 | | | | | | | |Q|
3 | |Q| | | | | | |
2 | | | | |Q| | | |
1 | | |Q| | | | | |
   a b c d e f g h
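
One possible reading of the fitness function, goal test, and board printout is sketched below. It works on plain digit strings rather than the provided State class, so the function names here are hypothetical, and counting non-attacking pairs is just one common choice of fitness.

```python
from itertools import combinations


def fitness(digits):
    """Count pairs of queens that do NOT attack each other (28 is the maximum for n = 8)."""
    rows = [int(d) for d in digits]  # rows[i] is the row of the queen in column i
    non_attacking = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2):
        same_row = r1 == r2
        same_diagonal = abs(r1 - r2) == abs(c1 - c2)
        if not (same_row or same_diagonal):
            non_attacking += 1
    return non_attacking


def is_goal(digits):
    """A state is a solution when every pair of queens is non-attacking."""
    n = len(digits)
    return fitness(digits) == n * (n - 1) // 2


def render_board(digits):
    """Draw the board with the highest row on top and column letters underneath."""
    n = len(digits)
    lines = []
    for row in range(n, 0, -1):
        cells = ["Q" if int(d) == row else " " for d in digits]
        lines.append(f"{row} |" + "|".join(cells) + "|")
    lines.append("   " + " ".join("abcdefgh"[:n]))
    return "\n".join(lines)


print(render_board("53172864"))
print("fitness:", fitness("53172864"))
```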

Submit your code by pushing your main branch to GitLab.

Due 2/27

(50 Points) Assignment 4: Tic Tac Toe
Implement the MINIMAX algorithm for Tic Tac Toe in Python. The starter repository has a game board class implemented for you to use. (Note that it was generated by an LLM, so don't complain about its style :) ) The class is immutable, which will be helpful in implementing the MINIMAX search.

The starter repository also has a player class. You will be responsible for implementing two methods in the player class: move and utility. Utility is where you will implement the MINIMAX algorithm to determine how good the current board state is for the player. The utility is represented by 1, 0, or -1 if the state will lead to a win, tie, or loss, respectively, for the player. The move method will cause the player to make an optimal play for the current game state. Because the board is immutable, the move method will produce a new board based on the player's move.
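
To make the roles of utility and move concrete, here is a minimal sketch that does not use the starter classes. It assumes a board stored as a tuple of nine cells ("X", "O", or None); the function signatures are hypothetical and will differ from the provided player class.

```python
from functools import lru_cache

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that mark fills a line, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def place(board, index, mark):
    """Boards are immutable tuples, so placing a mark builds a new tuple."""
    return board[:index] + (mark,) + board[index + 1:]


@lru_cache(maxsize=None)
def utility(board, player, to_move):
    """MINIMAX value for `player`: 1 (win), 0 (tie), or -1 (loss)."""
    won = winner(board)
    if won is not None:
        return 1 if won == player else -1
    if all(cell is not None for cell in board):
        return 0
    other = "O" if to_move == "X" else "X"
    values = [utility(place(board, i, to_move), player, other)
              for i, cell in enumerate(board) if cell is None]
    # The player maximizes their own outcome; the opponent minimizes it.
    return max(values) if to_move == player else min(values)


def move(board, player):
    """Return the new board after an optimal play by `player`."""
    other = "O" if player == "X" else "X"
    options = [place(board, i, player)
               for i, cell in enumerate(board) if cell is None]
    return max(options, key=lambda b: utility(b, player, other))


EMPTY = (None,) * 9
print(utility(EMPTY, "X", "X"))
```

The cache is optional; it only avoids re-searching positions that can be reached by different move orders.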

Submit your code by pushing your main branch to GitLab.

Due 3/6

(50 Points) Assignment 5: Sudoku
Implement the AC-3 algorithm for Sudoku in Python. Again, the starter repository has a game board class implemented for you to use. Also again, I used an LLM to generate it, so the style is questionable. Each square on the board has either a set of possible numbers or a set value.

You will be implementing the AC-3 algorithm inside the solve_puzzle function in the player module (file). The function does not need to return anything; it just updates the given puzzle with the solution. If there is no solution to the puzzle or AC-3 cannot find the solution, just fill in as much of the solution as possible.

AC-3 cannot actually solve harder Sudoku puzzles, only easy or medium ones, because binary (pairwise) consistency checks are not sufficient. For extra credit, add backtracking to AC-3. This will allow the algorithm to guess, and reverse the guess later if necessary. This should be powerful enough to solve any Sudoku puzzle.
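
The core of AC-3 can be sketched on a tiny problem before applying it to a full board. The version below assumes every constraint is "these two variables must differ" (as between two squares in the same Sudoku row, column, or box) and uses plain dictionaries of sets; it does not use the starter board class, and the variable names are made up.

```python
from collections import deque


def ac3(domains, neighbors):
    """Prune domains in place until every arc is consistent.

    domains:   {variable: set of values still possible}
    neighbors: {variable: variables that must take a different value}
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # A value of x is unsupported if y has no different value left to pair with it.
        unsupported = {vx for vx in domains[x]
                       if not any(vx != vy for vy in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # x changed, so arcs pointing at x must be rechecked.
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))
    return True


# Hypothetical three-square unit: A is already fixed to 1.
unit_domains = {"A": {1}, "B": {1, 2}, "C": {1, 2, 3}}
unit_neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

consistent = ac3(unit_domains, unit_neighbors)
print(consistent, unit_domains)
```

Here the fixed value in A removes 1 from B, which in turn removes 2 from C. On a full puzzle the same propagation can stall with several candidates left in a square, which is where the backtracking extension comes in.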

Submit your code by pushing your main branch to GitLab.

Due 3/18

(50 Points) Assignment 6: Wordle
Implement a Wordle player! In the starter repository there is starter code for a player and a game environment to test your player in. It also includes all of the official Wordle words (well, at least I found them in the JS) that you can use to formulate your player's guesses.

You are free to make your agent work however you want, so be creative; just don't cheat the game by inspecting the environment as it runs, etc. However, you are required to use conditional probability somewhere in your agent. For example, what is the probability the answer starts with s given that s is somewhere in the word?
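
The example question above can be estimated by counting over a word list. The sketch below assumes every word is equally likely to be the answer and uses a tiny made-up list rather than the official words from the starter repository.

```python
def conditional_probability(words, event, given):
    """Estimate P(event | given), treating every word in `words` as equally likely."""
    matching_given = [w for w in words if given(w)]
    if not matching_given:
        return 0.0  # the condition never holds, so there is nothing to estimate from
    return sum(1 for w in matching_given if event(w)) / len(matching_given)


# Hypothetical word list for illustration only.
toy_words = ["stone", "crash", "slate", "brisk", "mount", "spine"]

p = conditional_probability(
    toy_words,
    event=lambda w: w.startswith("s"),
    given=lambda w: "s" in w,
)
print(f"P(starts with s | s in word) = {p:.2f}")
```

Five of the six toy words contain an s and three of those start with it, so the estimate is 3/5.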

To make this even more fun, the student with the best agent gets extra credit! I'll judge which one is best based on the average number of guesses and failures each agent makes.

Submit your code by pushing your main branch to GitLab.

Due 4/3

(50 Points) Assignment 7: Classification & Regression
Use scikit-learn to apply linear and logistic regression to these two datasets: appendix (download here) and power. The goal is to predict Diagnosis and Zone 1 Power Consumption, respectively, in each dataset. Each dataset needs to be cleaned and dummy variables need to be created. Do your best with missing data: introduce an UNKNOWN category when possible for discrete data, and skip rows with missing continuous data. For the timestamp in the power dataset, try creating dummy variables for season, month, day of week, hour of day, etc. Which variables affect model performance when they are included? You need to determine which model is appropriate to apply to which dataset. Split the data into training and testing sets, fit the models, and report appropriate accuracy metrics in the README.md file.
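
As a rough illustration of the timestamp and dummy variable ideas, the sketch below uses only the standard library. The timestamp format, the month-to-season mapping, and the category names are all assumptions; check them against the actual datasets.

```python
from datetime import datetime

# Assumed mapping (meteorological seasons, northern hemisphere).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def timestamp_features(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Split a timestamp string into discrete features (the format is an assumption)."""
    moment = datetime.strptime(text, fmt)
    return {
        "season": SEASON_BY_MONTH[moment.month],
        "month": moment.month,
        "day_of_week": moment.weekday(),  # 0 = Monday ... 6 = Sunday
        "hour": moment.hour,
    }


def one_hot(value, categories, prefix):
    """Turn one discrete value into 0/1 dummy variables; None becomes UNKNOWN."""
    if value is None:
        value = "UNKNOWN"
    return {f"{prefix}_{c}": int(value == c)
            for c in list(categories) + ["UNKNOWN"]}


features = timestamp_features("2025-04-03 14:30:00")
season_dummies = one_hot(features["season"],
                         ["winter", "spring", "summer", "fall"], "season")
print(features)
print(season_dummies)
```

Libraries such as pandas and scikit-learn offer their own encoders for this step; the point here is only what the resulting columns look like.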

Submit your code by pushing your main branch to GitLab.

Due 4/10

(50 Points) Assignment 8: Non-linear Models
Use scikit-learn to apply decision and regression trees, K-NN, and feed-forward neural networks on these two datasets from the previous assignment: appendix and power. The goal is to predict Diagnosis and Zone 1 Power Consumption, respectively, in each dataset. Give the data the same treatment as you did for the previous assignment. You need to determine which model is appropriate to apply to which dataset, but use as many as are applicable! Make sure to do model selection. Split the data into training and testing sets, fit the models, and report appropriate accuracy metrics in the README.md file. Be sure to use a validation set to pick optimal hyperparameter settings. How do these models compare to the linear models from the previous assignment?
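
The role of the validation set can be shown with a small self-contained sketch. To keep the split logic visible it swaps in a hand-written 1-D K-NN classifier and synthetic data in place of scikit-learn and the assignment datasets, so treat it as an outline of the procedure only.

```python
import random


def knn_predict(train, x, k):
    """Majority label (0 or 1) among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0  # k is odd, so there are no ties


def accuracy(train, examples, k):
    hits = sum(knn_predict(train, x, k) == label for x, label in examples)
    return hits / len(examples)


# Synthetic data: two overlapping 1-D clusters, one per class.
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(75)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(75)]
rng.shuffle(data)

# 60% training, 20% validation, 20% testing.
train, validation, test = data[:90], data[90:120], data[120:]

# Pick the hyperparameter on the validation set only.
candidate_ks = [1, 3, 5, 7, 9]
validation_scores = {k: accuracy(train, validation, k) for k in candidate_ks}
best_k = max(candidate_ks, key=validation_scores.get)

# Touch the test set once, with the chosen setting.
test_accuracy = accuracy(train, test, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.3f}")
```

The test split is used only after k has been chosen, so the reported accuracy is not biased by the hyperparameter search.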

Submit your code by pushing your main branch to GitLab.

Due 4/17

(75 Points) Assignment 9: Document Classification
Use scikit-learn to apply logistic regression, decision trees, K-NN, and a feed-forward neural network to do document classification on this dataset: YouTube Comments. Be sure to use a validation set to pick optimal hyperparameter settings. The goal is to predict whether or not the comment is spam; the class column contains a 1 if the comment is spam or 0 otherwise. To represent each document, use the bag of words and TF-IDF representations from scikit-learn. Report appropriate accuracy metrics on both training and testing splits for each model in the README.md file.
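
For intuition about what the two representations contain, here is a small standard-library sketch with made-up comments. It uses the textbook weighting tf * log(N / df); scikit-learn's vectorizers apply smoothing and normalization by default, so their numbers will differ from these.

```python
import math
from collections import Counter


def bag_of_words(documents):
    """One Counter of term counts per document (lowercased, split on whitespace)."""
    return [Counter(doc.lower().split()) for doc in documents]


def tf_idf(documents):
    """Weight each term count by log(N / number of documents containing the term)."""
    counts = bag_of_words(documents)
    n_docs = len(documents)
    # Iterating a Counter yields each distinct term once, giving document frequency.
    document_frequency = Counter(term for c in counts for term in c)
    return [
        {term: tf * math.log(n_docs / document_frequency[term])
         for term, tf in c.items()}
        for c in counts
    ]


# Hypothetical comments, not taken from the dataset.
toy_comments = [
    "check out my channel",
    "great song",
    "check out this great video",
]

counts = bag_of_words(toy_comments)
weights = tf_idf(toy_comments)
print(counts[0])
print(weights[0])
```

Terms that appear in fewer documents (such as "my") receive a larger weight than terms shared across documents (such as "check").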

Submit your code by pushing your main branch to GitLab.

Due 4/24

(100 Points) Assignment 10: LLM Fine-Tuning
Fine-tune an LLM with PyTorch on a dataset of your choice. The LLM must be available on Hugging Face. For datasets, take a look at Kaggle Datasets. You will want to find data that naturally fits into the prompt-response paradigm. The expected data format is JSON structured in the following manner:

[
  {
    "instruction": "How can I write a hello world program in python?",
    "output": "print(\"hello world\")"
  },
  ...
]
Create a set of questions to assess the model and feed the questions to both the original model and the fine-tuned version. On which questions did the fine-tuned model produce a better answer? How are you determining what is 'better'?
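
Before starting a long fine-tuning run, it may help to check that your data file really matches the JSON format described above. The sketch below is one way to do that with the standard library; the function names and the prompt template are hypothetical and are not taken from the provided scripts.

```python
import json

# A one-record sample in the expected format. Note the escaped inner quotes,
# which JSON requires inside a string value.
raw = r'''
[
  {
    "instruction": "How can I write a hello world program in python?",
    "output": "print(\"hello world\")"
  }
]
'''


def load_records(text):
    """Parse the JSON text and check that every record has both required keys."""
    records = json.loads(text)
    for index, record in enumerate(records):
        missing = {"instruction", "output"} - set(record)
        if missing:
            raise ValueError(f"record {index} is missing {sorted(missing)}")
    return records


def to_prompt(record):
    """Hypothetical prompt template; use whatever template your scripts expect."""
    return f"### Instruction:\n{record['instruction']}\n\n### Response:\n"


records = load_records(raw)
print(to_prompt(records[0]) + records[0]["output"])
```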

You will need to write a script to query the original LLM. You should be able to combine parts of the two scripts I gave you to do this, though!

You will need access to a GPU for this assignment. If you don't have one, let me know and I will create an account for you on the HPC.

Commit any code or changes you made to GitLab. Submit a summary of your experience and evaluation of the models to Canvas.

Make sure to start early and experiment on a small sample of your data for debugging, etc. It can take more than a day to fine-tune the model!


This page was last modified on 2025-05-06 at 03:56:02.

Copyright © 2018–2025 George Fox University. All rights reserved.