This post series was inspired by the Korolev Job And Two Smoking Barrels: upon accepting the quest, I realized how terrible the rewards were in comparison to the cost. For those who don’t play the game, a little background is necessary. The Cycle: Frontier is an Evacuation Shooter with three main Corporations - or Organizations, if you prefer - that act as quest and reward givers. Quests come in two kinds: Campaign and Jobs. The Campaign quests push you along to different areas of the world, whereas the Jobs function as a way to collect scrip and as an excuse to send players to the planet.

Jobs ask for one of three kinds of tasks: 1. Collect Stuff. 2. Deposit Stuff. 3. Kill Stuff, including players.

For Deposit Jobs, you carry the requested items to a Dead Drop and deposit them there. This one requires you to deposit a gun you can purchase from the shop. Now, you could find this weapon or loot it from other players, but neither is guaranteed at all. The K-Marks reward - K-Marks are cash, basically - is $19,000 and the gun it wants you to deposit costs $22,000, so there is no reason to take this job unless you already have the gun. Anyway, let’s get to the fun part.

Scraping and Cleaning Cycle Data

So, we’ll start with the normal imports for working with data in Python.

import pandas as pd
import numpy as np
import seaborn as sns
import requests as r

Thankfully, there is an official wiki for the game, maintained by both the Developers and the Community. We’re going to pull the data from there, as it should be the most up to date. Like most web scraping, there is some trial and error in getting what you’re after from the page. Since there are three organizations, there are three job tables we’d like to pull from the website. After some experimenting, I found that match="Name" was perfect for pulling those tables out of the page.

# game taken down
# url = "https://thecyclefrontier.wiki/wiki/Jobs"
url = "https://archive.ph/6pFGb"

site = pd.read_html(url, match="Name",
    converters = {
        "Name": str,
        "Description": str, 
        "Unlocked": int, 
        "Tasks": str,
        "Rewards": str})

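You may notice the converters argument above, which is a really useful feature I didn’t know about previously; basically, if you know the column names coming in, you can tell Pandas how to convert each column while parsing so you don’t have to convert later. One caveat: the keys have to match the column headers, and "Unlocked" does not match the table’s "Unlock Level" header, which most likely explains why that column still shows up as a float below. So, what does the data look like?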

site[0].head(8)

	Name	Description	Unlock Level	Difficulty	Tasks	Rewards
0	New Mining Tools	We are producing new Mining Tools for new Pros...	4.0	Easy	Collect: 2 Hydraulic Piston 10 Hardened Metals	3800 K-Marks 1 Korolev Scrip 15 Korolev R...
1	3800	K-Marks	NaN	NaN	NaN	NaN
2	1	Korolev Scrip	NaN	NaN	NaN	NaN
3	15	Korolev Reputation	NaN	NaN	NaN	NaN
4	Explosive Excavation	One of our mines collapsed with valuable equip...	7.0	Medium	Collect: 4 Derelict Explosives	11000 K-Marks 8 Korolev Scrip 52 Korolev ...
5	11000	K-Marks	NaN	NaN	NaN	NaN
6	8	Korolev Scrip	NaN	NaN	NaN	NaN
7	52	Korolev Reputation	NaN	NaN	NaN	NaN

That is not really what we were hoping would come in. Checking the actual site, this is caused by a table nested inside one of the row cells. But, like in the previous post, this can still be used after some work. So, let’s get to work!

Fixing the Job Rewards

We don’t need most of the columns for what we’re going to accomplish, so we’re going to pull out only the ones we need. I’m going to make a copy, as pandas sometimes doesn’t play so nicely with updates. A selection from a DataFrame can be a view onto the original data rather than an independent copy, so when updating slices I don’t always get what I expect. So, we’ll slice and copy to get only the data we care about in its own independent dataframe.

rewardsSubset = site[0][["Name", "Description", "Difficulty"]].copy()
rewardsSubset

	Name	Description	Difficulty
0	New Mining Tools	We are producing new Mining Tools for new Pros...	Easy
1	3800	K-Marks	NaN
2	1	Korolev Scrip	NaN
3	15	Korolev Reputation	NaN
4	Explosive Excavation	One of our mines collapsed with valuable equip...	Medium
...	...	...	...
183	470	Korolev Reputation	NaN
184	No Expiry Date	There you are, finally! There's been an accide...	Medium
185	6300	K-Marks	NaN
186	9	Korolev Scrip	NaN
187	62	Korolev Reputation	NaN

188 rows × 3 columns
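
To make the reason for the .copy() concrete, here is a minimal sketch on a made-up two-row frame. The toy data and the names toy and subset are illustrative assumptions, not wiki data; the sketch only shows that edits to the copy leave the source frame alone.

```python
import pandas as pd

# Hypothetical stand-in for site[0]; values are for illustration only.
toy = pd.DataFrame({
    "Name": ["New Mining Tools", "Explosive Excavation"],
    "Description": ["We are producing...", "One of our mines..."],
    "Difficulty": ["Easy", "Medium"],
})

# Select the columns and take an explicit copy so the result owns its data.
subset = toy[["Name", "Difficulty"]].copy()

# Overwriting a column on the copy does not touch the original frame.
subset["Difficulty"] = "overwritten"

print(toy["Difficulty"].tolist())     # original values are unchanged
print(subset["Difficulty"].tolist())  # only the copy was modified
```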

These column names are not useful so we’re going to correct those so they make sense with the project. We’re going to call the final column Job which will make more sense as the work gets done.

rewardsSubset.columns = ["Units", "Rewards", "Job"]
rewardsSubset.head()

	Units	Rewards	Job
0	New Mining Tools	We are producing new Mining Tools for new Pros...	Easy
1	3800	K-Marks	NaN
2	1	Korolev Scrip	NaN
3	15	Korolev Reputation	NaN
4	Explosive Excavation	One of our mines collapsed with valuable equip...	Medium

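So, looking at the output above, we can see that we have leftover data in the Job column which is no longer appropriate. We’re going to simply fill that column with a null value: np.nan

# np.NaN was removed in NumPy 2.0; np.nan is the supported spelling.
rewardsSubset.Job = np.nan
# Use object dtype so the string job titles written later fit the column.
rewardsSubset.Job = rewardsSubset.Job.astype(object)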
rewardsSubset.head(12)

	Units	Rewards	Job
0	New Mining Tools	We are producing new Mining Tools for new Pros...	NaN
1	3800	K-Marks	NaN
2	1	Korolev Scrip	NaN
3	15	Korolev Reputation	NaN
4	Explosive Excavation	One of our mines collapsed with valuable equip...	NaN
5	11000	K-Marks	NaN
6	8	Korolev Scrip	NaN
7	52	Korolev Reputation	NaN
8	Mining Bot	Our engineers have designed an autonomous mini...	NaN
9	6900	K-Marks	NaN
10	9	Korolev Scrip	NaN
11	62	Korolev Reputation	NaN

So, now for the hard part: getting the Job Title into the Job column. Looking at the data above, we can see that the Job Title always sits at a row index that is a multiple of four. We can confirm this by dividing the total number of rows by 4, just to be sure.

# This should be divisible by 4 since the rewards for jobs are always in this format now.
len(rewardsSubset) / 4

47.0

And, we have a perfect divide! Good! This is because each Job always hands out K-Marks, a matching Corp Scrip, and Corp Reputation. So, what we need to do now is pull the Job Title from the Units column and insert it into the Job column of the next three rows. To do this, we’re going to build a range of values which are multiples of 4, starting at 0 and running up to the total number of rows. We don’t want to hard code this since the count of jobs should be expected to change over time.

# Each job's block starts at a multiple of 4; take the stop from the data instead of hard-coding it.
index = range(0, len(rewardsSubset), 4)
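
Dividing by 4 only confirms the row count, so a slightly stricter check may be worth having before relying on the layout. The sketch below is an assumption-laden helper, not part of the original workflow: the name looks_like_blocks_of_four and the toy series are hypothetical, and it assumes job titles are never purely numeric.

```python
import pandas as pd

def looks_like_blocks_of_four(units: pd.Series) -> bool:
    """Return True if rows come as [title, number, number, number] blocks."""
    if len(units) % 4 != 0:
        return False
    # Title rows fail numeric parsing (become NaN); reward rows parse as numbers.
    is_numeric = pd.to_numeric(units, errors="coerce").notna().to_numpy()
    blocks = is_numeric.reshape(-1, 4)
    # First entry of each block must be a title, the other three must be numbers.
    return bool((~blocks[:, 0]).all() and blocks[:, 1:].all())

# Toy data shaped like the Units column (values copied from the first two jobs).
toy_units = pd.Series([
    "New Mining Tools", "3800", "1", "15",
    "Explosive Excavation", "11000", "8", "52",
])

print(looks_like_blocks_of_four(toy_units))           # True
print(looks_like_blocks_of_four(toy_units.iloc[1:]))  # False (7 rows)
```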

Next we’ll want a numpy array of the offsets. We don’t want to use a plain Python list because + on lists concatenates them instead of adding element-wise, so we wouldn’t get a set of row indexes. In effect, we’re trying to take advantage of Broadcasting in numpy. Here is a quick illustration.

listMistake = [1,2,3]
broadcastCorrect = np.array(listMistake)

[index[1]] + listMistake, index[1] + broadcastCorrect

([4, 1, 2, 3], array([5, 6, 7]))

Above you can see [4, 1, 2, 3] is definitely not what we’re after. So, after setting up the proper offset, let’s make sure we’re getting what we want. I often sanity check my initial design since experience has taught me you can still trip even after the initial testing works. So, let’s do that now.

offset = np.array([1, 2, 3])

# this is how we'll iterate; proof it works.
for i in index[:3]:
    aJob = rewardsSubset.iloc[i, 0]
    print(f'{aJob} is at index {i}')

New Mining Tools is at index 0
Explosive Excavation is at index 4
Mining Bot is at index 8

And, there we go! We’re getting exactly what we wanted and expected. This is also a good initial test for the loop which we’re going to tuck into a function at the end of all this. So, now to test the logic of copying the values from the Units column to the Job column.

# Do the thing:
aJob = rewardsSubset.iloc[index[0], 0]
indexes = index[0] + offset
rewardsSubset.iloc[ indexes, 2 ] = aJob
rewardsSubset.head(9)

	Units	Rewards	Job
0	New Mining Tools	We are producing new Mining Tools for new Pros...	NaN
1	3800	K-Marks	New Mining Tools
2	1	Korolev Scrip	New Mining Tools
3	15	Korolev Reputation	New Mining Tools
4	Explosive Excavation	One of our mines collapsed with valuable equip...	NaN
5	11000	K-Marks	NaN
6	8	Korolev Scrip	NaN
7	52	Korolev Reputation	NaN
8	Mining Bot	Our engineers have designed an autonomous mini...	NaN

for i in index:
    aJob = rewardsSubset.iloc[i, 0]
    indexes = i + offset
    rewardsSubset.iloc[ indexes, 2 ] = aJob

rewardsSubset.head(12)

	Units	Rewards	Job
0	New Mining Tools	We are producing new Mining Tools for new Pros...	NaN
1	3800	K-Marks	New Mining Tools
2	1	Korolev Scrip	New Mining Tools
3	15	Korolev Reputation	New Mining Tools
4	Explosive Excavation	One of our mines collapsed with valuable equip...	NaN
5	11000	K-Marks	Explosive Excavation
6	8	Korolev Scrip	Explosive Excavation
7	52	Korolev Reputation	Explosive Excavation
8	Mining Bot	Our engineers have designed an autonomous mini...	NaN
9	6900	K-Marks	Mining Bot
10	9	Korolev Scrip	Mining Bot
11	62	Korolev Reputation	Mining Bot

Perfect! Now all we have to do is drop the rows where the Job Title still sits in the Units column. Luckily, those rows still hold NaN in the Job column, so we can build a mask of where Job is null and then simply get rid of them.

# Kill the NA's
cutNA = rewardsSubset.Job.isna()
rewardsSubset[ ~cutNA ].head(15)

	Units	Rewards	Job
1	3800	K-Marks	New Mining Tools
2	1	Korolev Scrip	New Mining Tools
3	15	Korolev Reputation	New Mining Tools
5	11000	K-Marks	Explosive Excavation
6	8	Korolev Scrip	Explosive Excavation
7	52	Korolev Reputation	Explosive Excavation
9	6900	K-Marks	Mining Bot
10	9	Korolev Scrip	Mining Bot
11	62	Korolev Reputation	Mining Bot
13	7600	K-Marks	None of your Business
14	10	Korolev Scrip	None of your Business
15	90	Korolev Reputation	None of your Business
17	10000	K-Marks	Insufficient Processing Power
18	11	Korolev Scrip	Insufficient Processing Power
19	110	Korolev Reputation	Insufficient Processing Power
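
The loop-and-offset steps above can also be written without an explicit loop. This is an alternative sketch, not the approach used in this post: it marks the title rows by position, forward-fills the titles with where and ffill, and then drops the title rows. The toy frame and the names is_title and tidy are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like rewardsSubset (first two Korolev jobs, descriptions shortened).
toy = pd.DataFrame({
    "Units": ["New Mining Tools", "3800", "1", "15",
              "Explosive Excavation", "11000", "8", "52"],
    "Rewards": ["We are producing...", "K-Marks", "Korolev Scrip", "Korolev Reputation",
                "One of our mines...", "K-Marks", "Korolev Scrip", "Korolev Reputation"],
})

# Title rows sit at positions 0, 4, 8, ...
is_title = np.arange(len(toy)) % 4 == 0

# Keep Units only on title rows, then carry each title down over its reward rows.
toy["Job"] = toy["Units"].where(is_title).ffill()

# Drop the title rows and convert the remaining Units to integers.
tidy = toy.loc[~is_title].copy()
tidy["Units"] = tidy["Units"].astype(int)

print(tidy)
```

The positional mask plays the same role as the range of multiples of 4, so it relies on the same layout assumption as the loop.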

Function to build Job Rewards

Now that we have all this we can push it into a function and run it against all the different Corporation tables.

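def buildJobsRewards(data):
    # Take one Corporation's raw jobs table and return tidy (Units, Rewards, Job) rows.

    # Keep three columns and rename them for their new roles.
    rewardsSubset = data[["Name", "Description", "Difficulty"]].copy()
    rewardsSubset.columns = ["Units", "Rewards", "Job"]

    # Job titles sit at every 4th row; the next three rows are that job's rewards.
    # NOTE: stopping at len - 4 leaves the last job's block unprocessed (it starts
    # at len - 4), so its rewards are dropped below. Use len(rewardsSubset) as the
    # stop to include it. The averages at the end of this post used this range as written.
    index = range( 0, len(rewardsSubset) - 4, 4)
    offset = np.array([1, 2, 3])

    # Clear the old Difficulty values (np.NaN was removed in NumPy 2.0; use np.nan).
    rewardsSubset.Job = np.nan
    rewardsSubset.Job = rewardsSubset.Job.astype(object)

    # Copy each job title into the Job column of its three reward rows.
    for i in index:
        aJob = rewardsSubset.iloc[i, 0]
        indexes = i + offset
        rewardsSubset.iloc[ indexes, 2 ] = aJob

    # Drop every row that never received a job title.
    cutNA = rewardsSubset.Job.isna()
    rewardsSubset = rewardsSubset[ ~cutNA ]

    # Only numeric reward amounts remain in Units, so convert them to integers.
    rewardsSubset = rewardsSubset.assign(
        Units = rewardsSubset['Units'].astype(int)
    )

    return rewardsSubset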

And, the final test!

KorolevRewards = buildJobsRewards( site[0] )
icaRewards = buildJobsRewards( site[1] )
osirisRewards = buildJobsRewards( site[2] )

KorolevRewards.head(9)

	Units	Rewards	Job
1	3800	K-Marks	New Mining Tools
2	1	Korolev Scrip	New Mining Tools
3	15	Korolev Reputation	New Mining Tools
5	11000	K-Marks	Explosive Excavation
6	8	Korolev Scrip	Explosive Excavation
7	52	Korolev Reputation	Explosive Excavation
9	6900	K-Marks	Mining Bot
10	9	Korolev Scrip	Mining Bot
11	62	Korolev Reputation	Mining Bot

icaRewards.head(9)

	Units	Rewards	Job
1	4400	K-Marks	Water Filtration System
2	1	ICA Scrip	Water Filtration System
3	15	ICA Reputation	Water Filtration System
5	7500	K-Marks	New Beds
6	9	ICA Scrip	New Beds
7	62	ICA Reputation	New Beds
9	13000	K-Marks	Station Defense
10	12	ICA Scrip	Station Defense
11	130	ICA Reputation	Station Defense

osirisRewards.head(9)

	Units	Rewards	Job
1	2200	K-Marks	Lab equipment
2	1	Osiris Scrip	Lab equipment
3	13	Osiris Reputation	Lab equipment
5	8100	K-Marks	Surveillance Center
6	8	Osiris Scrip	Surveillance Center
7	43	Osiris Reputation	Surveillance Center
9	8100	K-Marks	Gun Manufacturing
10	8	Osiris Scrip	Gun Manufacturing
11	52	Osiris Reputation	Gun Manufacturing

Conclusion

Now we’ve got tidy data for all the jobs from all the Corporations and their matching rewards. Next we’ll need to clean the actual tasks, which is going to be much harder since there is no consistent formatting. But, we’ll end this post with a simple question using the data we have: Which Corporation gives out the best average K-Marks?

KorolevRewards.query("Rewards == 'K-Marks'").Units.mean()

22936.956521739132

(
    round(KorolevRewards.query("Rewards == 'K-Marks'").Units.mean(),2),
    round(icaRewards.query("Rewards == 'K-Marks'").Units.mean(),2),
    round(osirisRewards.query("Rewards == 'K-Marks'").Units.mean(), 2)
)

(22936.96, 23158.33, 21136.17)

And, the winner is ICA barely over Korolev! Suck Less Osiris!
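
If the comparison needs to grow beyond three hand-written lines, one option is to stack the tidy frames and group by Corporation. The sketch below is only an illustration under stated assumptions: average_kmarks and toy_rewards are hypothetical helpers, the K-Marks values are the first three jobs from each table above, and the Scrip amounts are placeholders, so the toy averages are not the full-table averages.

```python
import pandas as pd

def average_kmarks(frames: dict) -> pd.Series:
    """Mean K-Marks per Corporation from tidy (Units, Rewards, Job) frames."""
    combined = pd.concat(
        [df.assign(Corp=name) for name, df in frames.items()],
        ignore_index=True,
    )
    kmarks = combined.query("Rewards == 'K-Marks'")
    means = kmarks.groupby("Corp")["Units"].mean().round(2)
    return means.sort_values(ascending=False)

def toy_rewards(corp, kmarks):
    """Build a small tidy frame; Scrip amounts are placeholder values."""
    rows = []
    for i, amount in enumerate(kmarks):
        rows.append({"Units": amount, "Rewards": "K-Marks", "Job": f"job {i}"})
        rows.append({"Units": 1, "Rewards": f"{corp} Scrip", "Job": f"job {i}"})
    return pd.DataFrame(rows)

frames = {
    "Korolev": toy_rewards("Korolev", [3800, 11000, 6900]),
    "ICA": toy_rewards("ICA", [4400, 7500, 13000]),
    "Osiris": toy_rewards("Osiris", [2200, 8100, 8100]),
}

result = average_kmarks(frames)
print(result)
```

With the real data, the dictionary would hold KorolevRewards, icaRewards, and osirisRewards instead of the toy frames.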