matlab - Optimizing a function that compares 1.7 million entries against itself
I want to calculate the distance of all UK postcodes against each other, then sum the population of all the postcodes within 1 mile of each one. The postcode and population list is stored in a text file. I am most familiar with Matlab, but I also have Stata and SPSS available. The program is currently estimated to take about 2 weeks to run. Is there anything I can do to speed it up? Here is my code. The text data is imported with the script that Matlab generates. The distance is computed by the Mapping Toolbox distance function, which uses the great circle formula.
Any help is greatly appreciated.
    function pcdistance(postcode, pop, lat, lon)
    % Finds the total population within a 1 mile radius for UK postcodes
    fid = fopen('ppc.txt', 'a');
    n = length(postcode);
    % Calculates the distance of 1 postcode at a time against all others,
    % deleting all those that do not meet the rule
    for i = 1:n
        dist = [];
        dist(:, 1) = pop;
        for j = 1:n
            dist(j, 2) = distance(lat(i), lon(i), lat(j), lon(j), 3963.17);
            good = dist(1:j, 2) <= 1;
        end
        dist = dist(good, :);
        tot = sum(dist(:, 1));   % sum of the population within 1 mile
        fprintf(fid, '%s,%d;', postcode{i}, tot);
    end
    fclose(fid);
    end
Here is a small sample of the content of the txt file. The columns are, respectively, "postcode, pop, lat, lon".
"BD7 1DB",749,5.339,-1.76
"M15 6A",748,53.46,-2.24
"WR2 6AJ",748,52.19,-2.24
"M15 6FF",745,53.46,-2.23
"IP7 7RA",741,52.12,0.96
"CF62 4AA",740,51.41,-3.41
"M2 2AR",738,53.47,-2.22
"NG1 4BR",737,52.95,-1.14
"ST16 3AW",735,52.81,-2.11
"AB25 1L",733,57.15,-2.10
"WF2 9AG",730,53.68,-1.50
"DT11 8RH",730,50.86,-2.12
"CW1 5NP",729,53.09,-2.41
"TR12 7RH",724,50.08,-5.25
"ST5 5DY",723,53.00,-2.27
"HA1 3HP",723,51.57,-0.33
"DL10 7NP",722,54.37,-1.62
"M1 7HR",719,53.47,-2.23
"B18 4AS",719,52.49,-1.93
"OX13 6JB",716,51.68,-1.30
This is the corrected code.
    function pcdistance4(postcode, pop, lat, lon)
    % Finds the total population within a 1 mile radius for UK postcodes
    fid = fopen('ppc.txt', 'a');
    n = length(postcode);
    % Preallocate
    dist = zeros(n, 2);
    tot = zeros(n, 1);
    tic
    for i = 1:n
        dist(:, 1) = pop;
        % One vectorized call: distance from postcode i to all postcodes
        dist(:, 2) = distance(lat(i), lon(i), lat(:), lon(:), 3963.17);
        good = dist(:, 2) <= 1 & dist(:, 2) ~= 0;
        tot(i) = sum(dist(good, 1));
        tot(i) = tot(i) + pop(i);   % add the postcode's own population
    end
    toc
    tic
    for j = 900001:n
        fprintf(fid, '%s,%d;\n', postcode{j}, tot(j));
    end
    toc
    fclose(fid);
    end
You should think in terms of memory and computational complexity. For example, see the comments in the modified function:
    function pcdistance(postcode, pop, lat, lon)
    fid = fopen('ppc.txt', 'a');
    n = length(postcode);
    % Preallocate (only two columns are used: population and distance)
    dist = zeros(n, 2);
    for i = 1:n
        % Avoid "deleting" the variable; you can overwrite it, because the
        % number of elements is always the same
        % dist = [];
        dist(:, 1) = pop;
        for j = 1:n
            % Unfortunately I do not have the mentioned toolbox, but there is
            % a high chance that you can avoid this loop. Probably something like:
            %   dist(:, 2) = distance(lat(i), lon(i), lat, lon, ...)
            % Try to vectorize it
            dist(j, 2) = distance(lat(i), lon(i), lat(j), lon(j), 3963.17);
            % There is no need for this operation; it is highly redundant and
            % computationally expensive:
            %   - in the first iteration you check 1 element
            %   - in the second iteration you check 2 elements (1 redundant)
            %   - in the j-th iteration you check j elements (j-1 redundant)
            % The total number of unneeded checks is 1 + 2 + 3 + ... + n-1
            % good = dist(1:j, 2) <= 1;
        end
        % Better to do this once, after the loop
        good = dist(:, 2) <= 1;
        % Slicing the whole matrix is also too expensive
        % dist = dist(good, :);
        % Better to index straight into the population column
        tot = sum(dist(good, 1));
        % Write out as before
    end
    % tot is the sum of the population within 1 mile
    fclose(fid);
    end
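Building on the vectorization advice above, the number of distance evaluations can probably be cut much further by sorting on latitude first, so that each postcode is only compared against the thin latitude band that could contain points within 1 mile. The following is only a sketch under some assumptions, not a tested replacement for the Mapping Toolbox call: `popWithinRadius` and `haversineMiles` are hypothetical names, the haversine formula is swapped in for `distance` so the code needs no toolbox, and every point within the radius is counted, including the postcode itself and any other postcodes at identical coordinates (the corrected code in the question excludes zero-distance points and then adds `pop(i)` back).

```matlab
function tot = popWithinRadius(pop, lat, lon, radiusMiles)
% Sum the population of all points within radiusMiles of each point
% (the point itself included). lat and lon are in degrees.
% Hypothetical helper: a sketch, not the Mapping Toolbox implementation.
earthRadiusMiles = 3963.17;            % same radius as used in the question
pop = pop(:); lat = lat(:); lon = lon(:);
n = numel(lat);

% Sort by latitude so the candidates for each point form one contiguous window.
[latS, order] = sort(lat);
lonS = lon(order);
popS = pop(order);

% Latitude difference (degrees) spanned by radiusMiles along a meridian,
% padded slightly so rounding cannot drop a point lying exactly on the edge.
dLat = (radiusMiles / earthRadiusMiles) * (180 / pi) * (1 + 1e-9);

totS = zeros(n, 1);
lo = 1;                                % first index of the candidate window
hi = 1;                                % last index of the candidate window
for i = 1:n
    % Both window edges only ever move forward as i increases.
    while latS(lo) < latS(i) - dLat
        lo = lo + 1;
    end
    while hi < n && latS(hi + 1) <= latS(i) + dLat
        hi = hi + 1;
    end
    % Exact great-circle test, vectorized over the window only.
    d = haversineMiles(latS(i), lonS(i), latS(lo:hi), lonS(lo:hi), earthRadiusMiles);
    popWindow = popS(lo:hi);
    totS(i) = sum(popWindow(d <= radiusMiles));
end

% Undo the sort so tot(k) corresponds to the k-th input row.
tot = zeros(n, 1);
tot(order) = totS;
end

function d = haversineMiles(lat1, lon1, lat2, lon2, R)
% Great-circle distance by the haversine formula. lat1 and lon1 are scalars,
% lat2 and lon2 are vectors, all in degrees; R is the sphere radius.
toRad = pi / 180;
phi1 = lat1 * toRad;
phi2 = lat2 * toRad;
dphi = phi2 - phi1;
dlam = (lon2 - lon1) * toRad;
a = sin(dphi / 2).^2 + cos(phi1) .* cos(phi2) .* sin(dlam / 2).^2;
d = 2 * R * asin(min(1, sqrt(a)));
end
```

With the data loaded as in the question, this would be called as `tot = popWithinRadius(pop, lat, lon, 1);` and the results written out afterwards with the same `fprintf` loop as before. The latitude window only discards points that cannot be within the radius, since the great-circle distance between two points is never smaller than their latitude difference times the Earth radius; the haversine test then decides the rest.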