Big-O of this algorithm
I am trying to figure out the Big-O of the code below.
What the code should do
Basically, I'm trying to select a subset (max size = selectionSize) of random nodes and any edges that exist between them. The selection of random nodes is performed in a while loop. Having done this, I want to select any edges that exist between the selected nodes.
What I think and why
I think the running time is... Edit: ugh, on second thought... never mind... The complexity is O(n^2), where n = selectionSize. The reason is that although I can increase the number of elements in nodes (for example, make it 10000), I don't think that can affect the algorithm, since I only ever iterate up to selectionSize. The only reason I'm a little worried this is wrong is the while loop, where I select random items from the list until I have enough. While this can take quite a long time (because it is random), I don't think it affects the overall result in terms of time. nodes does affect it (since node.getNeighbors() can return at most |nodes| neighbors). So I think that if selectionSize is equal to the size of nodes, the running time is O(n^2), where n = size of nodes.
Any advice / hints would be appreciated.
code
// nodes and selectionSize are my input:
int[] nodes = {1, 2, 3, ..., 1000}; // 1000 elements
int selectionSize = 500; // This can be at most the number of nodes (in this case 1000)

run_algo(nodes, selectionSize);

public void run_algo(int[] nodes, int selectionSize) {
    randomNodesList = {};
    while (randomNodesList.size() < selectionSize) {
        randomNode = selectRandomNode(nodes);          // Assume O(1)
        if (!randomNodesList.exists(randomNode)) {     // Assume O(1)
            randomNodesList.push_back(randomNode);     // Assume O(1)
        }
    }
    foreach (node in randomNodesList) {
        foreach (neighbor in node.getNeighbors()) {    // Max. nodes size (in this case 1000)
            if (!randomNodesList.exists(neighbor)) {   // Assume O(1)
                AddEdge(node, neighbor);               // Takes O(1) time
            }
        }
    }
}
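For reference, here is a minimal runnable Python sketch of the pseudocode above. The adjacency dict `neighbors` and the tuple-list standing in for AddEdge are my assumptions, not part of the original code; note that, mirroring the pseudocode, an edge is recorded only when the neighbor is *outside* the selection.

```python
import random

def run_algo(nodes, neighbors, selection_size):
    """Select selection_size distinct random nodes, then scan their edges.
    neighbors maps node -> list of neighbors (stands in for node.getNeighbors())."""
    selected = set()
    # Rejection-sampling loop, as in the question: keep drawing until
    # we have enough distinct nodes. Set membership tests are O(1).
    while len(selected) < selection_size:
        selected.add(random.choice(nodes))  # selectRandomNode, O(1)
    edges = []
    for node in selected:
        for neighbor in neighbors.get(node, []):
            if neighbor not in selected:        # assume O(1)
                edges.append((node, neighbor))  # "AddEdge", O(1)
    return selected, edges

# Example: 4 nodes in a cycle, pick 2 of them at random
nodes = [1, 2, 3, 4]
neighbors = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
sel, edges = run_algo(nodes, neighbors, 2)
```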
If selectRandomNode(nodes) samples with replacement (the same node can be selected twice), then the big-O is undefined, since the loop can in principle run forever (you can keep selecting the same node over and over).
If it samples without replacement, then it is O(n^2)
(in the worst case, every node can be connected to every other node).
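To make the distinction concrete, here is a small sketch (not from the original post) contrasting the two sampling strategies; without replacement, e.g. via random.sample, the selection step takes exactly selectionSize draws:

```python
import random

nodes = list(range(1, 1001))
selection_size = 500

# With replacement (as in the question): duplicates are drawn and
# rejected, so there is no worst-case bound on the iteration count.
selected = set()
while len(selected) < selection_size:
    selected.add(random.choice(nodes))

# Without replacement: exactly selection_size draws, so the selection
# step alone is O(selection_size).
selected_no_repl = random.sample(nodes, selection_size)
```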
Notes on selection with replacement:
- Consider an array A of size n and an empty array B. All elements of A are unique.
- The task is to fill B with n elements selected at random from A, and we want B to end up with at least k unique elements.
It can be shown that the probability of getting more than k unique elements increases with n (I added the equations after the graph).
Thus, in practice, the probability of completing the loop in one pass (i.e., in fewer than n draws) becomes large as the gap between n and k grows. It's very intuitive if you think about it; the math is just the cherry on top.
from math import comb

def k_numerator(n, k):
    # Inclusion-exclusion count of the ways n draws can cover exactly
    # k values: sum of (-1)^(k-i) * C(k, i) * i^n for i = k..1
    res = 0
    sign = 1
    for i in range(k, 0, -1):
        pow_term = (i ** n)
        comb_term = comb(k, i)
        prod = pow_term * comb_term
        prod = prod * sign
        res = res + prod
        sign = sign * -1
    return res

def p_exactly_k(n, k):
    """
    Returns the probability of `B` containing exactly `k` unique elements
    (also see notes above)
    """
    return (comb(n, k) * k_numerator(n, k)) / (n ** n)
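A quick sanity check of the formula (my addition, not part of the original answer): for n = 2 draws from 2 elements, "one unique value" and "two unique values" are equally likely, and for any n the probabilities over k sum to 1.

```python
from math import comb, isclose

def p_exactly_k(n, k):
    # Probability that n draws with replacement from n unique elements
    # yield exactly k distinct values (same formula as above).
    surj = sum((-1) ** (k - i) * comb(k, i) * i ** n for i in range(1, k + 1))
    return comb(n, k) * surj / n ** n

print(p_exactly_k(2, 1))  # 0.5: both draws land on the same element
print(p_exactly_k(2, 2))  # 0.5: the two draws are distinct
total = sum(p_exactly_k(10, k) for k in range(1, 11))
```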
I'm not 100% sure I understand this right, but let's break it down: the while loop runs until selectionSize nodes are picked, which in the best and worst case is n iterations (where n is the number of nodes).
Therefore the size of randomNodesList is in O(n). In a simple graph, a node can have at most n - 1 neighbors, so the whole nested loop is in O(n^2) (due to n * (n - 1)).
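The n * (n - 1) worst case is easy to see on a complete graph where the selection ends up being all n nodes; a quick count of inner-loop iterations (a sketch of mine, not from the answer):

```python
n = 100
nodes = list(range(n))
# Complete graph: every node neighbors every other node.
neighbors = {v: [u for u in nodes if u != v] for v in nodes}
selected = set(nodes)  # worst case: the selection is all n nodes

iterations = 0
for node in selected:
    for nb in neighbors[node]:
        iterations += 1  # the loop body runs regardless of the exists() check
```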
axiom is correct: it is actually impossible to find an upper bound for this algorithm. It is non-deterministic; it depends on your random numbers.