import pandas as pd
import numpy as np
import seaborn as sns
import requests as r
This post series was inspired by the Korolev job And Two Smoking Barrels: upon accepting the quest, I realized how terrible the rewards were in comparison to the cost. For those who don’t play the game, a little background will be necessary. The Cycle: Frontier is an Evacuation Shooter game and there are three main Corporations - or Organizations, if you prefer - who act as quest and reward givers in the game. Quests come in two kinds: Campaign and Jobs. Whereas the campaign quests push you along to different areas of the world, the Jobs function as a way to collect scrips and as an excuse to send players to the planet.
There are three kinds of Jobs: 1. Collect Stuff. 2. Deposit Stuff. 3. Kill Stuff, including players.
For Deposit Jobs, you carry the requested items to a Dead Drop and deposit them there. This one requires you to deposit a gun that you can purchase from the shop. Now, you could find this weapon or loot it from other players, but neither is guaranteed at all. The Kmark reward - Kmarks are cash, basically - is $19,000 and the gun it wants you to deposit costs $22,000, so there is no reason to take this job unless you already have the gun. Anyways, let’s get to the fun part.
Scraping and Cleaning Cycle Data
So, we’ll start with the normal imports for doing data in Python.
Thankfully, there is an official wiki for the game which is maintained by both the Developers and the Community together. We’re going to pull the data from there as that should be the most up-to-date source. Like most online data scraping, there are some try-fail loops before you get what you’re after from the webpage. Since there are three organizations, there are three tables with jobs we’d like to pull from the website. After some trial and error I found that using match="Name" was perfect for pulling the tables out of the webpage.
# game taken down
# url = "https://thecyclefrontier.wiki/wiki/Jobs"
url = "https://archive.ph/6pFGb"

site = pd.read_html(url, match="Name",
    converters = {
        "Name": str,
        "Description": str,
        "Unlocked": int,
        "Tasks": str,
        "Rewards": str})
You may notice the addition of the converters argument above, which is a really useful feature I didn’t know about previously; basically, if you know the column names coming in then you can tell Pandas what data type you want so you don’t have to convert later. So, what does the data look like?
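Before looking at the output, a quick aside on converters. The sketch below is not the scraping call from this post: it uses pd.read_csv, which accepts the same converters argument, on a hypothetical two-row sample so it runs without a network connection or an HTML parser. The sample text and the jobs variable are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row sample; the post itself reads an HTML table instead.
csv_text = "Name,Unlock Level\nNew Mining Tools,4\nExplosive Excavation,7\n"

# Each converter is called on the raw cell text of the column it is keyed by.
jobs = pd.read_csv(
    io.StringIO(csv_text),
    converters={"Name": str, "Unlock Level": int},
)

print(jobs)
print(jobs.dtypes)
```

Converters are looked up by column label, so a key has to match the header text of the incoming table to take effect.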
site[0].head(8)
 | Name | Description | Unlock Level | Difficulty | Tasks | Rewards |
---|---|---|---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | 4.0 | Easy | Collect: 2 Hydraulic Piston 10 Hardened Metals | 3800 K-Marks 1 Korolev Scrip 15 Korolev R... |
1 | 3800 | K-Marks | NaN | NaN | NaN | NaN |
2 | 1 | Korolev Scrip | NaN | NaN | NaN | NaN |
3 | 15 | Korolev Reputation | NaN | NaN | NaN | NaN |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | 7.0 | Medium | Collect: 4 Derelict Explosives | 11000 K-Marks 8 Korolev Scrip 52 Korolev ... |
5 | 11000 | K-Marks | NaN | NaN | NaN | NaN |
6 | 8 | Korolev Scrip | NaN | NaN | NaN | NaN |
7 | 52 | Korolev Reputation | NaN | NaN | NaN | NaN |
That is not really what we were hoping would come in. Checking the actual site, this is caused by a table nested inside one of the row cells. But - like in the previous post - this can still be used after some work. So, let’s get to work!
Fixing the Job Rewards
We don’t need most of the columns for what we’re going to accomplish, so we’re going to keep only the ones we care about. I’m going to take a copy, as pandas sometimes doesn’t play so nicely with updates: selecting part of a dataframe can hand back something that still shares data with the original, so updates to a slice don’t always do what I expect (this is what the SettingWithCopyWarning is about). So, we’ll slice and copy to get only the data we care about in its own independent dataframe.
rewardsSubset = site[0][["Name", "Description", "Difficulty"]].copy()
rewardsSubset
 | Name | Description | Difficulty |
---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | Easy |
1 | 3800 | K-Marks | NaN |
2 | 1 | Korolev Scrip | NaN |
3 | 15 | Korolev Reputation | NaN |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | Medium |
... | ... | ... | ... |
183 | 470 | Korolev Reputation | NaN |
184 | No Expiry Date | There you are, finally! There's been an accide... | Medium |
185 | 6300 | K-Marks | NaN |
186 | 9 | Korolev Scrip | NaN |
187 | 62 | Korolev Reputation | NaN |
188 rows × 3 columns
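To make the reason for the .copy() call above concrete, here is a minimal sketch on a hypothetical two-row frame (the jobs and subset names and the placeholder values are made up for illustration). It only shows that edits to the copied subset leave the source frame alone.

```python
import pandas as pd

# Hypothetical stand-in for one of the scraped tables.
jobs = pd.DataFrame({
    "Name": ["New Mining Tools", "3800"],
    "Description": ["placeholder text", "K-Marks"],
    "Difficulty": ["Easy", None],
    "Tasks": ["placeholder task", None],
})

# Select the columns of interest and detach them from the source frame.
subset = jobs[["Name", "Description", "Difficulty"]].copy()

# Overwrite a whole column in the subset, as the post does later with Job.
subset["Difficulty"] = "overwritten"

print(jobs["Difficulty"].tolist())    # the source frame is untouched
print(subset["Difficulty"].tolist())
```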
These column names are not useful, so we’re going to rename them to fit the project. We’re going to call the final column Job, which will make more sense as the work gets done.
rewardsSubset.columns = ["Units", "Rewards", "Job"]
rewardsSubset.head()
 | Units | Rewards | Job |
---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | Easy |
1 | 3800 | K-Marks | NaN |
2 | 1 | Korolev Scrip | NaN |
3 | 15 | Korolev Reputation | NaN |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | Medium |
So, looking at the output above, we can see that there is leftover data in the Job column which is no longer appropriate. We’re going to simply fill that column with a null value: NumPy’s NaN.
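# np.nan is the spelling that works on every NumPy version; the np.NaN alias was removed in NumPy 2.0.
rewardsSubset.Job = np.nan
rewardsSubset.head(12)
 | Units | Rewards | Job |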
---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | NaN |
1 | 3800 | K-Marks | NaN |
2 | 1 | Korolev Scrip | NaN |
3 | 15 | Korolev Reputation | NaN |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | NaN |
5 | 11000 | K-Marks | NaN |
6 | 8 | Korolev Scrip | NaN |
7 | 52 | Korolev Reputation | NaN |
8 | Mining Bot | Our engineers have designed an autonomous mini... | NaN |
9 | 6900 | K-Marks | NaN |
10 | 9 | Korolev Scrip | NaN |
11 | 62 | Korolev Reputation | NaN |
So, now for the hard part: getting the Job Title into the Job column. Looking at the data above, we can see that the Job Title always sits at a row index that is a multiple of four. We can confirm this by simply dividing the total number of rows by 4 just to be sure.
# This should be divisible by 4 since the rewards for jobs are always in this format now.
len(rewardsSubset) / 4
47.0
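The division above only confirms the row count. A slightly stricter check, sketched here on a hypothetical eight-row sample of the Units column, also confirms that the non-numeric rows (the job titles) really do sit at every fourth position. It assumes that no job title is itself a plain number.

```python
import pandas as pd

# Hypothetical sample shaped like the Units column: one title, then three reward amounts.
units = pd.Series([
    "New Mining Tools", "3800", "1", "15",
    "Explosive Excavation", "11000", "8", "52",
])

# Anything that cannot be parsed as a number is treated as a job title.
is_title = pd.to_numeric(units, errors="coerce").isna()
title_positions = is_title[is_title].index.tolist()

print(len(units) % 4 == 0)
print(title_positions)
print(title_positions == list(range(0, len(units), 4)))
```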
And, we have a perfect divide! Good! This is because each Job always hands out Kmarks, a matching Corp Scrip and Corp Reputation. So, what we need to do now is pull the Job Title from the Units column and insert it into the next three rows of the Job column. To do this, we’re going to build a range of values which are multiples of 4, starting at 0 and running up to the total number of rows. We don’t want to hard code this since the count of jobs should be expected to change over time.
# Job titles sit at rows 0, 4, 8, ...; take the stop value from the data instead of hard coding it.
topIndex = len(rewardsSubset)
index = range( 0, topIndex, 4)
Next we’ll want a numpy array of the offsets. We don’t want to use a list, because adding to a python list concatenates the values onto it instead of producing a set of indexes. In effect, we’re trying to take advantage of Broadcasting in numpy. Here is a quick illustration.
listMistake = [1,2,3]
broadcastCorrect = np.array(listMistake)

[index[1]] + listMistake, index[1] + broadcastCorrect
([4, 1, 2, 3], array([5, 6, 7]))
Above you can see [4, 1, 2, 3] is definitely not what we’re after. So, after setting up the proper offset, let’s make sure we’re getting what we want. I often sanity check my initial design since experience has taught me you can still trip even after the initial testing works. So, let’s do that now.
offset = np.array([1, 2, 3])
# this is how we'll iterate; proof it works.
for i in index[:3]:
    aJob = rewardsSubset.iloc[i, 0]
    print(f'{aJob} is at index {i}')
New Mining Tools is at index 0
Explosive Excavation is at index 4
Mining Bot is at index 8
And, there we go! We’re getting exactly what we wanted and expected. This is also a good initial test for the loop which we’re going to tuck into a function at the end of all this. So, now to test the logic of copying the values from the Units column to the Job column.
# Do the thing:
aJob = rewardsSubset.iloc[index[0], 0]
indexes = index[0] + offset
rewardsSubset.iloc[ indexes, 2 ] = aJob
rewardsSubset.head(9)
 | Units | Rewards | Job |
---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | NaN |
1 | 3800 | K-Marks | New Mining Tools |
2 | 1 | Korolev Scrip | New Mining Tools |
3 | 15 | Korolev Reputation | New Mining Tools |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | NaN |
5 | 11000 | K-Marks | NaN |
6 | 8 | Korolev Scrip | NaN |
7 | 52 | Korolev Reputation | NaN |
8 | Mining Bot | Our engineers have designed an autonomous mini... | NaN |
for i in index:
    aJob = rewardsSubset.iloc[i, 0]
    indexes = i + offset
    rewardsSubset.iloc[ indexes, 2 ] = aJob

rewardsSubset.head(12)
 | Units | Rewards | Job |
---|---|---|---|
0 | New Mining Tools | We are producing new Mining Tools for new Pros... | NaN |
1 | 3800 | K-Marks | New Mining Tools |
2 | 1 | Korolev Scrip | New Mining Tools |
3 | 15 | Korolev Reputation | New Mining Tools |
4 | Explosive Excavation | One of our mines collapsed with valuable equip... | NaN |
5 | 11000 | K-Marks | Explosive Excavation |
6 | 8 | Korolev Scrip | Explosive Excavation |
7 | 52 | Korolev Reputation | Explosive Excavation |
8 | Mining Bot | Our engineers have designed an autonomous mini... | NaN |
9 | 6900 | K-Marks | Mining Bot |
10 | 9 | Korolev Scrip | Mining Bot |
11 | 62 | Korolev Reputation | Mining Bot |
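The same broadcasting idea can fill every job in one step instead of looping. This is a sketch on a hypothetical eight-row frame, not the code used in this post; the toy, starts, targets and titles names are assumptions introduced for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two jobs, laid out like the scraped table.
toy = pd.DataFrame({
    "Units": ["Job A", "100", "1", "10", "Job B", "200", "2", "20"],
    "Rewards": ["text A", "K-Marks", "Scrip", "Reputation",
                "text B", "K-Marks", "Scrip", "Reputation"],
})
toy["Job"] = None  # empty object column, so strings can be written into it

starts = np.arange(0, len(toy), 4)   # title rows: 0, 4
offset = np.array([1, 2, 3])

# Broadcasting a column of starts against the offsets gives one row of targets per job.
targets = starts[:, None] + offset   # [[1, 2, 3], [5, 6, 7]]

# Repeat each title three times so it lines up with the flattened targets.
titles = toy["Units"].to_numpy()[starts]
toy.iloc[targets.ravel(), 2] = np.repeat(titles, 3)

print(toy)
```

The loop in the post is easier to read and debug; the broadcast version mostly pays off when the table gets large.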
Perfect! Now all we have to do is cut the rows where the Job Title still sits in the Units column. Luckily, the NaN has remained in the Job column for exactly those rows, so we can build a mask of where that value exists and then simply get rid of them.
# Kill the NA's
cutNA = rewardsSubset.Job.isna()
rewardsSubset[ ~cutNA ].head(15)
 | Units | Rewards | Job |
---|---|---|---|
1 | 3800 | K-Marks | New Mining Tools |
2 | 1 | Korolev Scrip | New Mining Tools |
3 | 15 | Korolev Reputation | New Mining Tools |
5 | 11000 | K-Marks | Explosive Excavation |
6 | 8 | Korolev Scrip | Explosive Excavation |
7 | 52 | Korolev Reputation | Explosive Excavation |
9 | 6900 | K-Marks | Mining Bot |
10 | 9 | Korolev Scrip | Mining Bot |
11 | 62 | Korolev Reputation | Mining Bot |
13 | 7600 | K-Marks | None of your Business |
14 | 10 | Korolev Scrip | None of your Business |
15 | 90 | Korolev Reputation | None of your Business |
17 | 10000 | K-Marks | Insufficient Processing Power |
18 | 11 | Korolev Scrip | Insufficient Processing Power |
19 | 110 | Korolev Reputation | Insufficient Processing Power |
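For comparison, the same tidy result can be reached without positional indexes by forward filling the titles. This is an alternative technique, not the approach taken in this post, shown on a hypothetical eight-row frame; it assumes every non-numeric value in Units is a job title.

```python
import pandas as pd

# Hypothetical frame with two jobs, laid out like the scraped table.
toy = pd.DataFrame({
    "Units": ["Job A", "100", "1", "10", "Job B", "200", "2", "20"],
    "Rewards": ["text A", "K-Marks", "Scrip", "Reputation",
                "text B", "K-Marks", "Scrip", "Reputation"],
})

# Rows whose Units value is not a number are the title rows.
is_title = pd.to_numeric(toy["Units"], errors="coerce").isna()

# Keep the title on title rows only, then carry it down over the reward rows.
toy["Job"] = toy["Units"].where(is_title).ffill()

# Drop the title rows and turn the remaining amounts into integers.
tidy = toy[~is_title].assign(Units=lambda d: d["Units"].astype(int))

print(tidy)
```

This version does not depend on every job having exactly three reward rows, which may matter if the wiki layout ever changes.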
Function to build Job Rewards
Now that we have all this we can push it into a function and run it against all the different Corporation tables.
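def buildJobsRewards(data):
    # Function to take job rewards data and return a cleaned version
    rewardsSubset = data[["Name", "Description", "Difficulty"]].copy()
    rewardsSubset.columns = ["Units", "Rewards", "Job"]

    # NOTE: stopping at len - 4 never labels the last block of four rows, so the
    # final job is dropped below; range(0, len(rewardsSubset), 4) would keep it.
    index = range( 0, len(rewardsSubset) - 4, 4)
    offset = np.array([1, 2, 3])

    # np.nan works on every NumPy version; the np.NaN alias was removed in NumPy 2.0.
    rewardsSubset.Job = np.nan

    # Copy each job title into the Job column of its three reward rows.
    for i in index:
        aJob = rewardsSubset.iloc[i, 0]
        indexes = i + offset
        rewardsSubset.iloc[ indexes, 2 ] = aJob

    # Drop every row that never received a job title.
    cutNA = rewardsSubset.Job.isna()
    rewardsSubset = rewardsSubset[ ~cutNA ]

    rewardsSubset = rewardsSubset.assign(
        Units = rewardsSubset['Units'].astype(int)
    )
    return rewardsSubset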
And, the final test!
KorolevRewards = buildJobsRewards( site[0] )
icaRewards = buildJobsRewards( site[1] )
osirisRewards = buildJobsRewards( site[2] )
KorolevRewards.head(9)
 | Units | Rewards | Job |
---|---|---|---|
1 | 3800 | K-Marks | New Mining Tools |
2 | 1 | Korolev Scrip | New Mining Tools |
3 | 15 | Korolev Reputation | New Mining Tools |
5 | 11000 | K-Marks | Explosive Excavation |
6 | 8 | Korolev Scrip | Explosive Excavation |
7 | 52 | Korolev Reputation | Explosive Excavation |
9 | 6900 | K-Marks | Mining Bot |
10 | 9 | Korolev Scrip | Mining Bot |
11 | 62 | Korolev Reputation | Mining Bot |
icaRewards.head(9)
 | Units | Rewards | Job |
---|---|---|---|
1 | 4400 | K-Marks | Water Filtration System |
2 | 1 | ICA Scrip | Water Filtration System |
3 | 15 | ICA Reputation | Water Filtration System |
5 | 7500 | K-Marks | New Beds |
6 | 9 | ICA Scrip | New Beds |
7 | 62 | ICA Reputation | New Beds |
9 | 13000 | K-Marks | Station Defense |
10 | 12 | ICA Scrip | Station Defense |
11 | 130 | ICA Reputation | Station Defense |
osirisRewards.head(9)
 | Units | Rewards | Job |
---|---|---|---|
1 | 2200 | K-Marks | Lab equipment |
2 | 1 | Osiris Scrip | Lab equipment |
3 | 13 | Osiris Reputation | Lab equipment |
5 | 8100 | K-Marks | Surveillance Center |
6 | 8 | Osiris Scrip | Surveillance Center |
7 | 43 | Osiris Reputation | Surveillance Center |
9 | 8100 | K-Marks | Gun Manufacturing |
10 | 8 | Osiris Scrip | Gun Manufacturing |
11 | 52 | Osiris Reputation | Gun Manufacturing |
Conclusion
Now we’ve got tidy data for all the jobs from all the Corporations and their matching rewards. Next we’ll need to clean the actual tasks which is going to be much harder since there is no consistent formatting. But, we’ll end this post with a simple question using the data we have: Which Corporation gives out the best average Kmarks?
KorolevRewards.query("Rewards == 'K-Marks'").Units.mean()
22936.956521739132
(round(KorolevRewards.query("Rewards == 'K-Marks'").Units.mean(),2),
round(icaRewards.query("Rewards == 'K-Marks'").Units.mean(),2),
round(osirisRewards.query("Rewards == 'K-Marks'").Units.mean(), 2)
)
(22936.96, 23158.33, 21136.17)
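The three separate queries above could also be collapsed into one grouped calculation. The sketch below assumes a hypothetical Corp label column added to each frame and uses only a handful of rows copied from the previews earlier in the post, so its averages are illustrative and are not the real results.

```python
import pandas as pd

# A few rows per corporation taken from the previews above; not the full tables.
korolev = pd.DataFrame({
    "Units": [3800, 1, 11000, 8],
    "Rewards": ["K-Marks", "Korolev Scrip", "K-Marks", "Korolev Scrip"],
})
ica = pd.DataFrame({
    "Units": [4400, 1, 7500, 9],
    "Rewards": ["K-Marks", "ICA Scrip", "K-Marks", "ICA Scrip"],
})

# Tag each frame with its corporation (Corp is a made-up column name) and stack them.
combined = pd.concat(
    [korolev.assign(Corp="Korolev"), ica.assign(Corp="ICA")],
    ignore_index=True,
)

# Average K-Marks per corporation in one pass.
average_kmarks = (
    combined.query("Rewards == 'K-Marks'")
    .groupby("Corp")["Units"]
    .mean()
    .round(2)
)

print(average_kmarks)
```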
And, the winner is ICA barely over Korolev! Suck Less Osiris!