Tuesday, January 28, 2020

Data Application Development Earthquake and Breast Cancer

Data Application Development for Earthquake and Breast Cancer Datasets

Abstract-This report is a general study of two datasets. The first contains data from the earthquake that occurred in the Marche region of Italy in 2016; the second contains mammography data, with mean values of the measurements and structures of tumors found in patients. Different data science techniques were applied to both, with the intention of revealing conclusions that are impossible to see a priori.

Keywords-Italy Earthquake, Mammography studies, MapReduce algorithm, Python.

With the high processing power that modern computers have acquired, one of the fastest-developing scientific branches is data science, which consists of the generalized extraction of knowledge from information and data. Unlike statistical analysis, data science is more holistic: it uses large volumes of data to extract knowledge that adds value to an organization of any kind.

In this project, the breast cancer dataset contains information on the geometry, size and texture of tumors found in approximately 5100 patients. The main idea with this database is to construct a predictive model that can tell, from the description of a tumor, whether it is benign or malignant. The other dataset contains information about the earthquake that occurred in Italy in 2016. It contains all the aftershocks that occurred in the three days that followed, and every event is geotagged. With this dataset the main idea is to do data mining and to visualize the information in an innovative way, applying geospatial theory and statistical techniques specific to data science.

A. Italy 2016 Earthquake Dataset

This database is open source, accessible to the community, and part of the extensive catalog offered free of charge by the Kaggle website. Its structure is as follows:

Table I: Dataset template
Time      Latitude  Longitude  Depth  Magnitude
UTC time  WGS84     WGS84      km     Richter scale

It has 8086 records with full data history; each row represents an earthquake event. For each event, the following properties are given: the exact time of the event in the format Y-m-d hh:mm:ss.ms; the exact geographical coordinates of the event, in latitude and longitude; the depth of the hypocenter in kilometers; and the magnitude on the Richter scale. The dataset was collected from the real-time updated list of the Italian Earthquakes National Center. From now on we will call this dataset A.

B. Breast Cancer (Diagnostic) Data Set

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. ... in the 3-dimensional space is that described in [1].

Attribute Information:
1) ID number
2) Diagnosis (M = malignant, B = benign)
3) Ten real-valued features are computed for each cell nucleus:
(a) radius (mean of distances from center to points on the perimeter)
(b) texture (standard deviation of gray-scale values)
(c) perimeter
(d) area
(e) smoothness (local variation in radius lengths)
(f) compactness (perimeter^2 / area - 1.0)
(g) concavity (severity of concave portions of the contour)
(h) concave points (number of concave portions of the contour)
(i) symmetry
(j) fractal dimension (coastline approximation - 1)
4) The mean, standard error and worst or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
5) All feature values are recorded with four significant digits.

This database was obtained from the Kaggle website.
It belongs to their repository and is open to any scientist in the world who wants to study it. From now on we will call this dataset B.

Knowledge extraction is mainly related to the discovery process known as Knowledge Discovery in Databases (KDD), which refers to the non-trivial process of discovering knowledge and potentially useful information within the data contained in some information repository [2]. It is not an automatic process; it is an iterative process that exhaustively explores very large volumes of data to determine relationships, and it extracts quality information that can be used to draw conclusions based on relationships or models within the data.

A. Data selection

Both databases were carefully chosen based on the following criteria:
A reliable source or repository, which guarantees the reliability of the data. For this report the source is Kaggle, which maintains a database that is open to the public and on which users can comment.
Data without an excessive number of blank values, since filling these gaps with 0 can distort the model and invalidate the predictions or conclusions of the study.
At least 5000 rows, so that the study is substantial and its conclusions are measurable.

B. Information preprocessing

For both datasets, some simple statistical tests were performed with the intention of filling in the missing data in the most effective way. For example, for dataset B the standard deviation and the mean were calculated, and a frequency histogram was drawn to check whether the data followed a Gaussian distribution. The data is in fact distributed in this way, so the missing values were filled with values drawn at random from a distribution with the mean and standard deviation of the data. This keeps the filled-in values from introducing incorrect information.
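
The filling step described above for dataset B can be sketched roughly as follows. This is only an illustration under assumptions: the helper name impute_gaussian, the sample column and the fixed seed are hypothetical and do not come from the report, which does not show its own code.

```python
import random
import statistics


def impute_gaussian(values, seed=0):
    """Fill None entries with draws from a Gaussian fitted to the observed values."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the sketch is repeatable
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    std = statistics.stdev(observed)  # sample standard deviation of the observed values
    # Keep observed values untouched; replace each gap with a random draw.
    return [rng.gauss(mean, std) if v is None else v for v in values]


# Hypothetical feature column with two missing measurements.
radius_mean = [14.2, None, 13.1, 15.8, None, 12.9, 14.6]
filled = impute_gaussian(radius_mean)
print(filled)
```

Drawing from the fitted distribution, instead of inserting the mean everywhere, keeps the spread of the column close to that of the observed data, which seems to be the motivation given in the text.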
For dataset A, the average values were obtained, and the latitude and longitude of each exact point where an earthquake occurred were rounded off so that each event could be given a geospatial label matching an Italian province.

C. Transformation

For both datasets, the MapReduce algorithm was applied on top of the HDFS data architecture. The idea is to map key values to each datum and its header so that access to them is efficient; this is intended to make the data more robust and to reduce processing times. This type of algorithm is mainly meant to keep the data in distributed systems, although for this project only a single node was configured.

D. Data Mining

At this stage of the process it is already clear how the data are distributed, and this is where we decide which Machine Learning or Data Mining algorithms to apply. For dataset B, we chose a Machine Learning algorithm based on logistic regression, for the following reasons:
It was verified that the data follow a linear distribution and are correlated with each other.
Since the result is a decision, benign or malignant (1 or 0), the most intuitive choice is to apply logistic regression to predict the diagnoses.

For dataset A, the technique used is an a posteriori study of the cataclysm, with the intention of revealing conclusions about the earthquake, focused on the geospatial side. Starting from the WGS84 labeling and the coordinates of each earthquake, it is possible to construct a density of earthquakes by region. With this data it is possible to determine which region was most affected, where the epicenter of the earthquake was, and whether there is a correlation between the depth of an earthquake and its magnitude.
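
As a rough illustration of the map and reduce steps described in the Transformation stage, the sketch below counts events per rounded coordinate cell in a single process. It does not use the Hadoop API; the function names and the sample events are hypothetical, and the one-decimal rounding is an assumption standing in for the geospatial labeling used in the report.

```python
from collections import defaultdict
from functools import reduce


def map_event(event):
    """Map step: emit (key, 1), where the key is the rounded coordinate cell."""
    lat, lon, _depth_km, _magnitude = event
    return (round(lat, 1), round(lon, 1)), 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values of each key to get the number of events per cell."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


# Hypothetical events: (latitude, longitude, depth in km, magnitude).
events = [
    (42.70, 13.23, 8.1, 6.0),
    (42.71, 13.17, 9.0, 3.2),
    (42.79, 13.14, 8.0, 5.4),
    (42.83, 13.11, 9.2, 6.5),
    (42.84, 13.12, 10.0, 2.9),
]
counts = reduce_counts(shuffle(map(map_event, events)))
print(counts)
```

The per-cell counts are the kind of intermediate result from which a density of earthquakes by region could be drawn; on a real cluster the same map and reduce functions would run on separate nodes over blocks of the file.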
The implementation was made in Python version 2.7. A few key libraries were used. The Python SciPy libraries required to implement the algorithms for B are: Scipy, numpy, matplotlib, pandas, sklearn, patsy and statsmodels. A few more are required for A: Pandas, Numpy, Matplotlib, Basemap, Shapely, Pysal, Descartes, Fiona, Pylabs and Statsmodels. The architecture used to store and read the data is the Hadoop Distributed File System (HDFS), the primary storage system used by Hadoop applications. HDFS is built to support applications with large data sets, including individual files that reach into the terabytes. It uses a master/slave architecture, with each cluster consisting of a single NameNode that manages file system operations and supporting DataNodes that manage data storage on individual compute nodes.

Fig. 1 shows the workflow diagram for the Machine Learning algorithm applied to dataset B.

Figure 1: Workflow for Machine Learning algorithm

Fig. 2 shows the workflow for dataset A. This workflow was constructed from the selected methodology; the idea is to follow this pattern of work to increase the productivity of the research, since these are frameworks that have been thoroughly tested by qualified researchers in the area.

Figure 2: Workflow for Data Mining research

For dataset B, an iteration stage is included in case the final predictions are not satisfactory; this would entail rethinking the model and computing all the values again. For dataset A, the diagram is focused on representing the data as fully as possible, in order to extract a substantial number of conclusions from graphs.

A. Dataset A

The first result obtained is a map of the central region of Italy with each of the roughly 8000 points where earthquakes occurred.

Figure 3: Scatter plot with administrative subdivisions

We've drawn a scatter plot on the map of Italy (Fig. 3), containing points with a 50-meter diameter, one for each record of dataset A.
This is a first step, but it doesn't really tell us anything interesting about the density per region, merely that there were more earthquakes in the Marche region of Italy than in the outlying areas.

Figure 4: Density plot with administrative subdivisions

Now we can see how the earthquakes were distributed (Fig. 4). It is clear on the map that the regions most affected were Lazio, Marche and Umbria.

Figure 5: Magnitude rolling mean

Most of the earthquakes occurred at a depth of 10 km. This can be seen in the frequency histogram of depth in Fig. 6.

Figure 6: Frequency Histogram

The following table shows the 5 earthquakes with the greatest impact and the regions where they occurred.

Table II: Greatest-magnitude earthquakes
Time        Region   Depth  Magnitude
2016-08-24  Lazio    8.1    6.0
2016-08-24  Umbria   8.0    5.4
2016-10-26  Umbria   8.7    5.4
2016-10-26  Brescia  7.5    5.9
2016-10-30  Brescia  9.2    6.5

B. Dataset B

We are going to look at two types of plots: univariate plots, to better understand each attribute, and multivariate plots, to better understand the relationships between attributes.

1) Univariate Plots: We start with some univariate plots, that is, plots of each individual variable. Given that the input variables are numeric, we can create box and whisker plots of each.

Figure 7: Whisker plots

Fig. 7 gives a much clearer idea of the distribution of the input attributes. It looks like most of the input variables may have a Gaussian distribution. This is useful to note, as we can use algorithms that exploit this assumption. The same can be seen in Fig. 8.

Figure 8: Frequency histogram

2) Algorithm evaluation: In this step we evaluated the most important Machine Learning algorithms to find which is best adapted to the data. We used statistical methods to estimate the accuracy of the models that we create on unseen data. We also want a more concrete estimate of the accuracy of the best model on unseen data, obtained by evaluating it on actual unseen data.
That is, we held back some data that the algorithms did not get to see, and we used this data to get a second, independent idea of how accurate the best model might actually be. We split the loaded dataset into two parts: 80% was used to train our models and 20% was held back as a validation dataset. We evaluated 6 different algorithms:
Logistic Regression (LR).
Linear Discriminant Analysis (LDA).
K-Nearest Neighbors (KNN).
Classification and Regression Trees (CART).
Gaussian Naive Bayes (NB).
Support Vector Machines (SVM).
This is a good mixture of simple linear (LR and LDA) and nonlinear (KNN, CART, NB and SVM) algorithms. We reset the random number seed before each run to ensure that the evaluation of each algorithm is performed using exactly the same data splits. This ensures that the results are directly comparable.

Figure 9: Algorithm comparison
LR: 0.658580 (0.027300)
LDA: 0.661676 (0.026534)
KNN: 0.606749 (0.023558)
CART: 0.569616 (0.041578)
NB: 0.621194 (0.032784)
SVM: 0.641823 (0.025195)

LR and LDA were the most accurate models that we tested, with nearly identical scores; we continued with LR. Now we want to get an idea of the accuracy of the model on our validation set. This gives us an independent final check on the accuracy of the best model. It is valuable to keep a validation set in case of a slip during training, such as overfitting to the training set or a data leak; both result in an overly optimistic estimate. We can run the LR model directly on the validation set and summarize the results as a final accuracy score, a confusion matrix and a classification report. The accuracy is 0.75, or 75%. The confusion matrix gives an indication of the 25 errors made.

As we can see, data science has a wide field of work, in areas so diverse that in this report alone they range from medicine to cartography and seismology.
With this report, it is evident how important Machine Learning algorithms are in cancer diagnosis. Although this small case study is not perfect, there are more advanced tools and more sophisticated algorithms that allow going much deeper into this field; the author recommends a degree project in which Deep Learning algorithms and deep neural networks are applied to the diagnosis of diseases. It is certainly a prominent field. On the other hand, with the first dataset it was possible to explore tools for handling maps and placing large amounts of data on them, with the main idea of exposing results that are impossible to observe by looking at the raw data. This makes it possible to find new points of view on phenomena that have already happened and to learn from them in order to improve infrastructure or tools. In short, data science is a field in full swing that will give much to talk about in the coming years. We live in an age where information is power, and manipulating and understanding information are the tools of the future.

References
[1] K. P. Bennett and O. L. Mangasarian, "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets," Optimization Methods and Software 1, 1992, 23-34.
[2] G. J. Williams and Z. Huang, "A case study in knowledge acquisition for insurance risk assessment using a KDD methodology," in Proceedings of the Pacific Rim Knowledge Acquisition Workshop, Dept. of AI, Univ. of NSW, Sydney, Australia, October 1996, pp. 117-129.

Monday, January 20, 2020

TRICARE: The Restructuring of Military Healthcare System in Response to

In the U.S. and other nations of the world, health expenditure and the number of physicians increase as the economy expands. However, physician shortage is of great concern globally, and the U.S. and the Military Healthcare System (MHS) are no exceptions. According to Garber (2004), “a shortage exists when there is unsatisfied demand, which occurs when the quantity of a good or service is less than what people will be willing to buy at the current price”. For example, a long wait to have elective surgery, or a long wait for a patient to get an appointment to see the doctor, is evidence of physician shortage. Another definition of shortage is “having a projected supply of physicians that meet less than 80% of the forecasted demand or need, calculated at the estimated means” (Scheffler, Liu, Kinfu, & Dal Poz, 2007). The World Health Organization report (2006) estimated that 57 countries had an absolute shortage of 2.3 million physicians. According to prior studies, this shortage implied the lack of a sufficient number of health care professionals to deliver skilled health interventions such as childbirth. Scheffler et al. (2007) projected, using the demand-based model, that the global supply of physicians would balance demand, with a sufficient surplus, in the year 2015. Despite this projection of surplus and balance in the global physician workforce, the problem of shortage will remain in some countries as a result of persistent distributional problems; Africa, for instance, will need an increase of about 65% in its supply of physicians by the year 2015 (Scheffler et al., 2007). According to Cooper (2004 & 2005) the shortage of physicians in the U.S. was related to the economic capacit... ....S and overseas to supplement the care provided to the growing beneficiary population in the MTFs. The MTF is the primary health care facility for TRICARE.
The TRICARE PCP shortage is due to deployment to war zones, humanitarian missions and special combat skill training. Throughout the research, attempts will be made to answer the primary question and then the sub-questions relating to TRICARE background history, epidemiology, physician types, administration, policies and law, finance, personnel, marketing, ethical issues, and beneficiary complaints and satisfaction. Other areas include the role restructuring plays in resolving beneficiary complaints and the impact the restructuring of TRICARE will have on health care delivery to beneficiaries. The summary, recommendations and conclusion will be addressed last to complete this research paper.

Sunday, January 12, 2020

Language Autobiography Essay

As I am a girl of mixed ethnic background, you can imagine the diversity of language used across my family. The dialects and accents vary widely, as my family is spread all across the globe. My mother, Carol, is British, born and bred in the Essex countryside, whereas my father, Ahmed, is half Lebanese and half Palestinian. My mum’s first language is English and she speaks Standard English; this could be because her profession as a nurse influences her speech, and it wouldn’t be professional of her to constantly use colloquial language. My father’s first language is Arabic, in the Palestinian dialect. There are so many dialects of Arabic that sometimes they seem like completely different languages! He can also speak French as fluently as he can Arabic, because French is also a main language in Lebanon. He is also fluent in English, but he has an Arab accent. My father lives in Lebanon, so his dialect of Arabic has changed to the Lebanese dialect because of his surroundings, but he still has a twang of the Palestinian dialect. The main languages in my family are English and Arabic, but there are so many dialects and accents: Egyptian, Jordanian, Emirati, Lebanese, Moroccan, Saudi Arabian and Syrian Arabic, and Essex, Dorset, Scottish, American and Australian accents. This is just the start of the variety of language in my family! So you’re probably thinking, what is my first language? Well, I was born in the United Arab Emirates, in the Emirate of Dubai. Yes, I think it too… why did I immigrate to sunny England? Growing up in Dubai, my first language was English because my mother’s Arabic was very basic; however, I was fluent in Arabic and could also speak some Tagalog, as I was brought up with a Filipino nanny, Lily. I immigrated to England when I was about 4 or 5 years old, and from then on I was constantly speaking English. I remember some of my mum’s friends telling me I had a slight American accent.
But my accent quickly changed because of the influences around me in school. My surname is Said, but it’s pronounced “Syed”, and I remember reading the Biff and Chip books in my first school and saying “and Chip Syed this”. My teacher found it highly amusing! Since I moved to England, I have slowly forgotten how to speak Arabic, as I got out of the habit of speaking it often. Now I only know greetings and little phrases in Arabic. Trying to learn Arabic again was extremely difficult because I’m so used to the rules of the English language, such as the “ough” sound. Being so used to certain rules really affects trying to learn a new language, especially Arabic. Learning Arabic was very different to learning English, and the Arabic alphabet has more letters than the English alphabet, which include sounds as well as letters. Also, not every word in Arabic can be translated perfectly into English; for some there is simply no English word. Sometimes it’s hard to get a near enough definition of the word without meaning something else. Arabic can also have one word which translates in English to a group of words or a sentence. From my experience of learning Arabic again I have noticed that the language is very cultural and influenced by religion; for example, a lot of words or phrases refer to God (Allah). However, not just Muslims and religious people use these words; they are used by all Arabic speakers. In Spanish I noticed a difference in tenses. In English there are only three tenses: past, present and future, whereas in Spanish there are many more. This makes it more complicated and difficult to learn, as to me there are realistically only three tenses, and it’s hard to picture others. I would describe my accent as a southern English accent. My cousins who live in Essex say that I have a “Brightonian” accent. Is there such a thing?
According to my cousins, people from Brighton raise their tone at the end of every sentence, as if they are constantly asking questions. I can’t notice myself doing it, or other people around me doing it. The way I talk changes depending on the context. For example, when I’m with my friends I use a large amount of colloquial language, whereas when I’m with my mum or teachers I would not use this language; I would talk in a more Standard English way. Having a lot of friends from an ethnic community, I’ve learnt a lot of slang and colloquial words. Even though these friends are from an Arabic background, I would never talk to my family in the Middle East in this way. I think I change the way I speak depending on who I am talking to, to make a good impression and to make my language appropriate to the situation. The different use of language always comes back to the context it’s used in.

Saturday, January 4, 2020

Analysis Of Why The Caged Bird Sings - 871 Words

In the poem “I Know Why The Caged Bird Sings” by Maya Angelou, one bird is free to fly happily and carefree while another is caged and can do nothing but sing for freedom. In this poem, written during a period of social unrest, Maya Angelou uses literary techniques to effectively emphasize the impacts of racism in America. One literary technique used throughout the poem is tone. In stanza one, the tone is peaceful and happy as Angelou describes “the orange sunrays”, while the bird “leaps”, “dips”, and “floats”. The author creates this joyful tone in the first stanza to better contrast with the depressing and unnerving tone of the second stanza. In the second stanza the bird’s “wings are clipped” and “his feet are tied”. When a… The free bird represents freedom and those who have it. He is able to go as he pleases and do whatever he wants to do. He “thinks of another breeze” because he isn’t burdened with wanting to be free. He can’t wish to be free when he already is. The other bird, whose “wings are clipped and his feet are tied”, symbolizes how people of color in America feel. Unlike the free bird, the other sang of things unknown but “longed for still”, just as those who are born into today’s society have no idea what freedom from oppression feels like but still wish and fight for it. While the birds are large symbols in this poem, Maya Angelou uses many other small symbols to emphasize the impacts of racism. The “distant hill” where the song of the caged bird can be heard represents the people able to help. Another symbol is the bird’s song. While it could represent calls for help, it could also represent the songs, poems, and literature created by people of color to express themselves and spread awareness of their situation. Characterization is another technique that Maya Angelou uses to help the reader better understand the effects of racism.
The characterization in the poem “provides an effective contrast with the bird that is caged” (Enotes). The characterization of the caged bird’s shadow shouting “on a nightmare scream” provides the feeling of helplessness of those who are affected by racism. The free bird, however, is characterized...